We propose an all-digital telescope for 21 cm tomography, which combines key advantages of both 
single dishes and interferometers. The electric field is digitized by antennas on a rectangular grid, 
after which a series of Fast Fourier Transforms recovers simultaneous multifrequency images of up 
to half the sky. Thanks to Moore's law, the bandwidth up to which this is feasible has now reached 
about 1 GHz, and will likely continue doubling every couple of years. The main advantages over a 
single dish telescope are cost and orders of magnitude larger field-of-view, translating into dramat- 
ically better sensitivity for large-area surveys. The key advantages over traditional interferometers 
are cost (the correlator computational cost for an $N$-element array scales as $N\log_2 N$ rather than 
$N^2$) and a compact synthesized beam. We argue that 21 cm tomography could be an ideal first 
application of a very large Fast Fourier Transform Telescope, which would provide both massive 
sensitivity improvements per dollar and mitigate the off-beam point source foreground problem 
with its clean beam. Another potentially interesting application is cosmic microwave background 
polarization. 



I. INTRODUCTION 

Since Galileo first pointed his telescope skyward, de- 
sign innovations have improved attainable sensitivity, 
resolution and wavelength coverage by many orders of 
magnitude. Yet we are still far from the ultimate tele- 
scope that simultaneously observes light of all wave- 
lengths from all directions, so there is still room for im- 
provement. 

From a mathematical point of view, telescopes are 
Fourier transformers. We want to know individual 
Fourier modes $\mathbf{k}$ of the electromagnetic field, as their 
direction $\hat{\mathbf{k}}$ encodes our image and their magnitude 
$k = \omega/c = 2\pi/\lambda$ encodes the wavelength, but the field at a 
given spacetime point $(\mathbf{r}, t)$ tells us only a sum of all these 
Fourier modes weighted by phase factors $e^{i[\mathbf{k}\cdot\mathbf{r} + \omega t]}$. 

Traditional telescopes perform the spatial Fourier 
transform from r-space to k-space by approximate analog 
means using lenses or mirrors, which are accurate across 
a relatively small field of view, and perform the temporal 
Fourier transform from $t$ to $\omega$ using slits, gratings or 
band-pass filters. Traditional interferometers used analog 
means to separate frequencies and measure electromag- 
netic field correlations between different receivers, then 
Fourier-transformed to r-space digitally, using comput- 
ers. In the tradeoff between resolution, sensitivity and 
cost, single dish telescopes and interferometers are highly 
complementary, and which is best depends on the science 
goal at hand. 

Thanks to Moore's law, it has very recently become 
possible to build all-digital interferometers up to about 
1 GHz, where the analog signal is digitized right at each 
antenna and subsequent correlations and Fourier trans- 
forms are done by computers. In addition to reducing 
various systematic errors, this digital revolution enables 
the "Fast Fourier Transform Telescope" or "omniscope" 



that we describe in this paper. We will show that it 
acts much like a single dish telescope with a dramati- 
cally larger field of view, yet is potentially much cheaper 
than a standard interferometer with comparable area. If 
a modern all-digital interferometer such as the MWA [1] 
is scaled up to a very large number of antennas $N$, its 
price becomes completely dominated by the computing 
hardware cost for performing of order $N^2$ correlations 
between all its antenna pairs. The key idea behind the 
FFT Telescope is that, if the antennas are arranged on a 
rectangular grid, this cost can be cut to scale merely as 
$N\log_2 N$ using Fast Fourier Transforms. As we will see, 
this design also eliminates the need for individual anten- 
nas that are pointable (mechanically or electronically), 
and has the potential to dramatically improve the sensitivity 
for some applications of future telescopes like the 
Square Kilometre Array without increasing their cost. 

This basic idea is rather obvious, so when we had it, 
we wondered why nothing like the massive all-sky low- 
frequency telescope that we are proposing had ever been 
built. We have since found other applications of the idea 
in the astronomy and engineering literature dating as far 
back as the early days of radio astronomy [8-17], and it is 
clear that the answer lies in lack of both computer power 
and good science applications. Moore's law has only re- 
cently enabled A/D conversion up to the GHz range, so 
in older work, Fourier transforms were done by analog 
means and usually in only one dimension (e.g., using a 
so-called Butler matrix [8]), severely limiting the num- 
ber of antennas that could be used. For example, the 45 
MHz interferometer in [9] used six elements. Moreover, 
to keep the number of elements modest while maintain- 
ing large collecting area, the elements themselves would 
be dishes or interconnected antennas that observed only 
a small fraction of the sky at any one time. A Japanese 
group worked on an analog 8 x 8 FFT Telescope about 
15 years ago for studying transient radio sources [10, 11], 
and then upgraded it to digital signal processing aiming 
for a 16 x 16 array with a field of view just under 1°. Elec- 
tronics from this effort is also used in the 1-dimensional 
8-element Nasu Interferometer [14]. 

Most traditional radio astronomy applications involve 
mapping objects subtending a small angle surrounded by 
darker background sky, requiring only enough sensitivity 
to detect the object itself. For most such cases, conven- 
tional radio dishes and interferometers work well, and an 
FFT Telescope (hereafter FFTT) is neither necessary nor 
advantageous. For the emerging field of 21 cm tomogra- 
phy, which holds the potential to one day overtake the 
microwave background as our most sensitive cosmologi- 
cal probe [18-24], the challenge is completely different: 
it involves mapping a faint and diffuse cosmic signal that 
covers all of the sky and needs to be separated from fore- 
ground contamination that is many orders of magnitude 
brighter, requiring extreme sensitivity and beam control. 
This 21 cm science application and the major efforts devoted 
to it by experiments such as MWA [1], LOFAR [2], 
PAPER [4], 21CMA [3], GMRT [5, 6] and SKA [7] make 
our paper timely. 

An interesting recent development is a North Amer- 
ican effort [15, 16] to do 21 cm cosmology with a one- 
dimensional array of cylindrical telescopes that can be 
analyzed with FFT's, in the spirit of the Cambridge 1.7m 
instrument from 1957, exploiting Earth rotation to fill in 
the missing two-dimensional information [15, 16]. We will 
provide a detailed analysis of this design below, arguing 
that it is complementary to the 2D FFTT at higher frequencies, 
while a 2D FFTT provides sharper cosmological 
constraints at low frequencies. 

The rest of this paper is organized as follows. In Sec- 
tion II, we describe our proposed design for FFT Tele- 
scopes. In Section III, we compare the figures of merit 
of different types of telescopes, and argue that the FFT 
Telescope is complementary to both single dish telescopes 
and standard interferometers. We identify the regimes 
where each of the three is preferable to the other two. In 
Section IV, we focus on the regime where the FFT Tele- 
scope is ideal, which is when you have strong needs for 
sensitivity and beam cleanliness but not resolution, and 
argue that 21 cm tomography may be a promising first 
application for it. We also comment briefly on cosmic 
microwave background applications. We summarize our 
conclusions in Section V and relegate various technical 
details to a series of appendices. 



II. HOW THE FFT TELESCOPE WORKS 

In this section, we describe the basic design and data 
processing algorithm for the FFT Telescope. We first 
summarize the relevant mathematical formalism, then 
discuss data processing, and conclude by discussing some 
practical issues. For a comprehensive discussion of radio 
interferometry techniques, see e.g. [25]. 



A. Interferometry without the flat sky 
approximation 

Since the FFT Telescope images half the sky at once, 
the flat-sky approximation that is common in radio astronomy 
is not valid. We therefore start by briefly summarizing 
the general curved-sky formalism. Suppose we have a set of 
antennas at positions $\mathbf{r}_n$ with sky responses 
$\mathbf{B}_n(\hat{\mathbf{k}})$ at a fixed frequency $\omega = ck$, $n = 1, \ldots$, 
and a sky signal $\mathbf{s}(\hat{\mathbf{k}})$ from the direction given by the unit 
vector $-\hat{\mathbf{k}}$ (this radiation thus travels in the direction 
$+\hat{\mathbf{k}}$). The data measured by each antenna in response to 
a sky signal $\mathbf{s}(\hat{\mathbf{k}})$ is then 

$$\mathbf{d}_n = \int e^{-i[\mathbf{k}\cdot\mathbf{r}_n + \omega t]}\,\mathbf{B}_n(\hat{\mathbf{k}})\,\mathbf{s}(\hat{\mathbf{k}})\,d\Omega_k. \qquad (1)$$

Details related to polarization are covered below in Appendix A, 
but are irrelevant for the present section. For 
now, all that matters is that $\mathbf{s}(\hat{\mathbf{k}})$ specifies the sky signal, 
$\mathbf{d}_n$ specifies the data that is recorded, and $\mathbf{B}_n(\hat{\mathbf{k}})$ 
specifies the relation between the two.¹ Specifically, $\mathbf{s}$ is the 
so-called Jones vector (a 2-dimensional complex vector 
field giving the electric field components, with phase, 
in two orthogonal directions), $\mathbf{d}_n$ is a vector containing 
the two complex numbers measured by the antenna, and 
$\mathbf{B}_n(\hat{\mathbf{r}})$, the so-called primary beam, is a $2 \times 2$ complex 
matrix field that defines both the polarization response 
and the sky response (beam pattern) of the antenna. The 
only properties of equation (1) that matter for our derivation 
below are that it is a linear relation (which comes 
from the linearity of Maxwell's equations) and that it 
contains the phase factor $e^{-i\mathbf{k}\cdot\mathbf{r}_n}$ (which comes from the 
extra path length $\hat{\mathbf{k}}\cdot\mathbf{r}_n$ that a wave must travel to get to 
the antenna location $\mathbf{r}_n$). 

The sky signal $\mathbf{s}(\hat{\mathbf{k}})$ has a slow time dependence 
because the sky rotates overhead, because of variable 
astronomical sources, and because of distorting 
atmospheric/ionospheric fluctuations. However, since these 
changes are many orders of magnitude slower than the 
electric field fluctuation timescale, we can to an excellent 
approximation treat equation (1) as exact for a 
snapshot of the sky. Below we derive how to recover the 
snapshot sky image from these raw measurements; only 
when coadding different snapshots does one need to take 
sky rotation and other variability into account. 

¹ If one wishes to observe the sky at frequencies $\nu$ higher than 
current technology can sample directly ($> 1$ GHz), then one can 
extract a bandwidth $\Delta\nu < 1$ GHz in this high frequency range using 
standard radio engineering techniques (first an analog frequency 
mixer multiplies the input signal with that from a local oscillator, 
then an analog low-pass filter removes frequencies above 
$\Delta\nu$, and finally the signal is A/D converted). The net effect of 
this is simply to replace $e^{-i\omega t}$ in equation (1) by $e^{-i(\omega - \omega_0)t}$ for 
some conveniently chosen local oscillator frequency $\omega_0 = 2\pi\nu_0$. 
It is thus the bandwidth $\Delta\nu$ rather than the actual frequencies 
$\nu$ that is limited by Moore's Law. 
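
As a rough numerical illustration of the down-conversion described in footnote 1, the sketch below mixes a single tone with a local oscillator and applies an idealized low-pass filter. All numbers (sampling rate, tone and oscillator frequencies, cutoff) are arbitrary assumptions chosen for illustration, and the FFT-based brick-wall filter merely stands in for the analog filter mentioned in the text.

```python
import numpy as np

# Illustrative values in arbitrary frequency units (assumptions, not from the paper).
fs = 2000.0                          # sampling rate of the simulation
t = np.arange(8000) / fs             # 4 time units of data
nu, nu0, dnu = 300.0, 280.0, 50.0    # sky tone, local oscillator, retained bandwidth

signal = np.cos(2 * np.pi * nu * t)
mixed = signal * np.cos(2 * np.pi * nu0 * t)   # contains nu - nu0 and nu + nu0

# Idealized low-pass filter: zero every Fourier component above dnu.
spectrum = np.fft.rfft(mixed)
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
spectrum[freqs > dnu] = 0.0
baseband = np.fft.irfft(spectrum, n=t.size)

# The surviving tone sits at the difference frequency nu - nu0.
peak_freq = freqs[np.argmax(np.abs(np.fft.rfft(baseband)))]
print(peak_freq)
```

Only the difference frequency survives the filter, which is why the sampling hardware needs to keep up with the bandwidth rather than with the sky frequency itself.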

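The statements above hold for any telescope array. For 
the special case of the FFT Telescope, all antennas have 
approximately identical beam patterns ($\mathbf{B}_n = \mathbf{B}$) and lie 
in a plane, which we can without loss of generality take 
to be the $z = 0$ plane so that $\hat{\mathbf{z}}\cdot\mathbf{r}_n = 0$. Using the fact 
that 

$$d\Omega_k = \sin\theta\,d\theta\,d\varphi = \frac{dk_x\,dk_y}{k\sqrt{k^2 - k_\perp^2}}, \qquad (2)$$

where $k_\perp \equiv \sqrt{k_x^2 + k_y^2}$ is the length of the component of 
the $\mathbf{k}$-vector perpendicular to the $z$-axis, we can rewrite 
equation (1) as a 2-dimensional Fourier transform 

$$\mathbf{d}_n = \int e^{-i[\mathbf{q}\cdot\mathbf{x}_n + \omega t]}\,\frac{\mathbf{B}(\mathbf{q})\,\mathbf{s}(\mathbf{q})}{k\sqrt{k^2 - q^2}}\,d^2q = \hat{\mathbf{s}}_B(\mathbf{x}_n)\,e^{-i\omega t}, \qquad (3)$$

where we have defined the 2-dimensional vectors 

$$\mathbf{q} \equiv \begin{pmatrix} k_x \\ k_y \end{pmatrix}, \qquad \mathbf{x}_n \equiv \begin{pmatrix} x_n \\ y_n \end{pmatrix}, \qquad (4)$$

and the function 

$$\mathbf{s}_B(\mathbf{q}) \equiv \frac{\mathbf{B}(\mathbf{q})\,\mathbf{s}(\mathbf{q})}{k\sqrt{k^2 - q^2}}. \qquad (5)$$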



Here the 2-dimensional function $\mathbf{s}(\mathbf{q})$ is defined to equal 
$\mathbf{s}(q_x, q_y, -[k^2 - q_x^2 - q_y^2]^{1/2})$ when $q = |\mathbf{q}| < k$, zero otherwise, 
and $\mathbf{B}(\mathbf{q})$ is defined analogously. $\mathbf{s}_B$ can therefore 
be thought of as the windowed, weighted and zero-padded 
sky signal. Equation (3) holds under the assumption 
that $\mathbf{B}(\hat{\mathbf{k}})$ vanishes for $k_z > 0$, i.e., that a ground 
screen eliminates all response to radiation heading up 
from below the horizon, so that we can limit the integration 
over solid angle to radiation pointed towards the 
lower hemisphere. Note that for our application, the simple 
Fourier relation of equation (3) is exact, and that none 
of the approximations that are commonly used in radio 
astronomy for the so-called "w-term" (see Equation 3.7 
in [25]) are needed. 
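
The change of variables in equation (2) can be checked numerically. The sketch below, which assumes units with $k = 1$ and uses hypothetical helper names, integrates the measure $d^2q/(k\sqrt{k^2 - q^2})$ over the disc $|\mathbf{q}| < q_0$ on a Cartesian grid and compares the result with the solid angle $2\pi(1 - \cos\theta_0)$ of the corresponding spherical cap, where $\sin\theta_0 = q_0/k$.

```python
import numpy as np

def solid_angle_from_q_plane(q0_over_k, n=1001):
    """Integrate d^2q / (k sqrt(k^2 - q^2)) over the disc |q| < q0, with k = 1."""
    k = 1.0
    q0 = q0_over_k * k
    # Midpoint grid covering the square [-q0, q0]^2.
    edges = np.linspace(-q0, q0, n + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    qx, qy = np.meshgrid(mid, mid, indexing="ij")
    q2 = qx**2 + qy**2
    inside = q2 < q0**2
    weight = np.zeros_like(q2)
    weight[inside] = 1.0 / (k * np.sqrt(k**2 - q2[inside]))
    cell_area = (edges[1] - edges[0]) ** 2
    return weight.sum() * cell_area

def cap_solid_angle(q0_over_k):
    """Solid angle of the spherical cap with sin(theta_0) = q0 / k."""
    theta0 = np.arcsin(q0_over_k)
    return 2.0 * np.pi * (1.0 - np.cos(theta0))

numeric = solid_angle_from_q_plane(0.5)
exact = cap_solid_angle(0.5)
print(numeric, exact)
```

The disc is kept well inside $q < k$ so that the integrable singularity at the horizon does not spoil the simple midpoint rule.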

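One usually models the fields arriving from different 
directions as uncorrelated, so that 

$$\langle \mathbf{s}(\hat{\mathbf{k}})\,\mathbf{s}(\hat{\mathbf{k}}')^\dagger \rangle = \delta(\hat{\mathbf{k}}, \hat{\mathbf{k}}')\,\mathbf{S}(\hat{\mathbf{k}}), \qquad (6)$$

where $\mathbf{S}(\hat{\mathbf{k}})$ is the $2 \times 2$ sky intensity Stokes matrix and 
the spherical $\delta$-function satisfies 

$$\delta(\hat{\mathbf{k}}, \hat{\mathbf{k}}') = \delta(\mathbf{q} - \mathbf{q}')\,k\sqrt{k^2 - k_\perp^2}, \qquad (7)$$

so that $\int \delta(\hat{\mathbf{k}}, \hat{\mathbf{k}}')\,g(\hat{\mathbf{k}}')\,d\Omega_k' = g(\hat{\mathbf{k}})$ for any function $g$. 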
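Combining equation (3) with equation (6) implies that 
the correlation between two measurements, traditionally 
referred to as a visibility, has the expectation value 

$$\langle \mathbf{d}_m \mathbf{d}_n^\dagger \rangle = \int e^{-i\mathbf{q}\cdot(\mathbf{x}_m - \mathbf{x}_n)}\,\frac{\mathbf{B}(\mathbf{q})^\dagger\,\mathbf{S}(\mathbf{q})\,\mathbf{B}(\mathbf{q})}{k\sqrt{k^2 - q^2}}\,d^2q = \hat{\mathbf{S}}_B(\mathbf{x}_m - \mathbf{x}_n), \qquad (8)$$

where $\hat{\mathbf{S}}_B(\mathbf{x})$ is the Fourier transform of 

$$\mathbf{S}_B(\mathbf{q}) \equiv \frac{\mathbf{B}(\mathbf{q})^\dagger\,\mathbf{S}(\mathbf{q})\,\mathbf{B}(\mathbf{q})}{k\sqrt{k^2 - q^2}}, \qquad (9)$$

which is the beam-weighted, projection-weighted and zero-padded 
sky brightness map. 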

In summary, things are not significantly more complicated 
than in standard interferometry in small sky 
patches (where the flat sky approximation is customarily 
made). One can therefore follow the usual radio astronomy 
procedure with minimal modifications: first measure 
$\hat{\mathbf{S}}_B(\Delta\mathbf{x})$ at a large number of baselines $\Delta\mathbf{x}$ corresponding 
to different antenna separations $\mathbf{x}_m - \mathbf{x}_n$, then use these 
measurements to estimate the Fourier transform of this 
function, $\mathbf{S}_B(\mathbf{q})$, and finally recover the desired sky map 
$\mathbf{S}$ by inverting equation (9): 

$$\mathbf{S}(\mathbf{q}) = k\sqrt{k^2 - q^2}\,\left[\mathbf{B}(\mathbf{q})^\dagger\right]^{-1}\mathbf{S}_B(\mathbf{q})\,\mathbf{B}(\mathbf{q})^{-1}. \qquad (10)$$



B. FFTT analysis algorithm 



Equation (8) shows that the Fourier transformed 
beam-convolved sky $\hat{\mathbf{S}}_B$ is measured at each baseline, 
i.e., at each separation vector $\mathbf{x}_m - \mathbf{x}_n$ for an antenna 
pair. A traditional correlating array with $N_a$ antennas 
measures all $N_a(N_a - 1)/2$ such pairwise correlations, and 
optionally fills in more missing parts of the Fourier plane 
exploiting Earth rotation. Since the cost of antennas, amplifiers, 
A/D-converters, etc. scales roughly linearly with 
$N_a$, this means that the cost of a truly massive array (like 
what may be needed for precision cosmology with 21 cm 
tomography [24]) will be dominated by the cost of the 
computing power for calculating the correlations, which 
scales like $N_a^2$. 
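
To give a feel for the numbers, the following sketch compares the pair count $N_a(N_a - 1)/2$ of a full correlator with the $N_a \log_2 N_a$ scaling quoted for the FFT approach. The function names are hypothetical, and constant prefactors (which depend on hardware and implementation) are ignored, so only the ratio's growth with $N_a$ is meaningful.

```python
import math

def correlator_ops(n_antennas):
    """Number of antenna pairs, N_a (N_a - 1) / 2, handled by a full correlator."""
    return n_antennas * (n_antennas - 1) // 2

def fft_ops(n_antennas):
    """Order-of-magnitude FFT cost, N_a log2(N_a), with constant factors ignored."""
    return n_antennas * math.log2(n_antennas)

for n in (10**2, 10**4, 10**6):
    ratio = correlator_ops(n) / fft_ops(n)
    print(f"N_a = {n:>8d}: pairs = {correlator_ops(n):.3e}, "
          f"N log2 N = {fft_ops(n):.3e}, ratio = {ratio:.1f}")
```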

For the FFT Telescope, the $N_a$ antenna positions $\mathbf{r}_n$ 
are chosen to form a rectangular grid. This means that 
all $N_a(N_a - 1)/2 \sim N_a^2$ baselines also fall on a rectangular 
grid, typically with any given baseline being measured 
by many different antenna pairs. 

The sums of $\mathbf{d}_m \mathbf{d}_n^\dagger$ for each baseline can be computed 
with only of order $N_a \log_2 N_a$ (as opposed to $N_a^2$) operations 
by using Fast Fourier Transforms. Essentially, 
what we wish to measure in the Fourier plane are the 
antenna measurements (laid out on a 2D grid) convolved 
with themselves, and this naively $N_a^2$ convolution can be 
reduced to an FFT, a squaring, and an inverse FFT. 
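
The "FFT, squaring, inverse FFT" shortcut can be sketched as follows for a single scalar polarization, which is a simplifying assumption (the text works with 2-component data vectors). The function names are hypothetical. The brute-force routine sums $d_m d_n^*$ over all antenna pairs, binned by baseline, and the FFT routine obtains the same sums from a zero-padded transform.

```python
import numpy as np

def baseline_sums_brute_force(d):
    """Sum d_m * conj(d_n) over all antenna pairs, binned by baseline x_m - x_n."""
    nx, ny = d.shape
    out = np.zeros((2 * nx - 1, 2 * ny - 1), dtype=complex)
    for mx in range(nx):
        for my in range(ny):
            for px in range(nx):
                for py in range(ny):
                    # Index (nx - 1, ny - 1) corresponds to the zero baseline.
                    out[mx - px + nx - 1, my - py + ny - 1] += (
                        d[mx, my] * np.conj(d[px, py])
                    )
    return out

def baseline_sums_fft(d):
    """Same sums via a zero-padded FFT, a squared modulus and an inverse FFT."""
    nx, ny = d.shape
    shape = (2 * nx - 1, 2 * ny - 1)
    d_hat = np.fft.fft2(d, s=shape)             # zero-padding avoids wrap-around
    corr = np.fft.ifft2(np.abs(d_hat) ** 2)     # circular autocorrelation of padded grid
    # Shift so that the zero baseline sits at index (nx - 1, ny - 1), as above.
    return np.roll(corr, (nx - 1, ny - 1), axis=(0, 1))

rng = np.random.default_rng(0)
d_demo = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
max_diff = np.max(np.abs(baseline_sums_brute_force(d_demo) - baseline_sums_fft(d_demo)))
print(max_diff)
```

The brute-force loop touches every antenna pair, whereas the FFT version only ever handles arrays whose size is proportional to the number of antennas.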

In fact, equation (3) shows that after FFT-ing the 2D 
antenna grid of data $\mathbf{d}_n$, one already has the two electric 
field components $\mathbf{s}_B(\mathbf{q})$ from each sky direction, and can 
multiply them to measure the sky intensity from each 
direction (Stokes $I$, $Q$, $U$ and $V$) without any need to return 
to Fourier space, as illustrated in Figure 1. This procedure 
is then repeated for each time sample and each frequency, 
and the many intensity maps at each frequency 
are averaged (after compensating for sky rotation, ionospheric 
motion, etc.) to improve signal-to-noise. 
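
A toy version of this direct-imaging branch is sketched below, under several assumptions that are not from the paper: a single scalar polarization, one monochromatic point source whose transverse wave vector happens to fall exactly on an FFT pixel, half-wavelength antenna spacing, and uncorrelated Gaussian receiver noise. With the $e^{-i\mathbf{q}\cdot\mathbf{x}_n}$ phase convention of equation (3), NumPy's inverse FFT is the transform that maps the source onto the pixel with the same index.

```python
import numpy as np

rng = np.random.default_rng(0)
n_side, n_t = 16, 200      # antennas per side and time samples (illustrative)
spacing = 0.5              # grid spacing in wavelengths (assumed)
k = 2 * np.pi              # wavenumber in units where the wavelength is 1
src_pixel = (3, 5)         # hypothetical source position, chosen on the FFT grid

# Transverse wave vector q of the source, aligned with an FFT pixel.
q = 2 * np.pi * np.array(src_pixel) / (n_side * spacing)
assert np.hypot(*q) < k    # the source lies above the horizon

ix, iy = np.meshgrid(np.arange(n_side), np.arange(n_side), indexing="ij")
phase = np.exp(-1j * (q[0] * ix + q[1] * iy) * spacing)   # e^{-i q.x_n}

# Random source amplitude per snapshot plus independent receiver noise.
s_t = (rng.normal(size=n_t) + 1j * rng.normal(size=n_t)) / np.sqrt(2)
noise = 0.5 * (rng.normal(size=(n_t, n_side, n_side))
               + 1j * rng.normal(size=(n_t, n_side, n_side)))
d = s_t[:, None, None] * phase[None] + noise

field = np.fft.ifft2(d, axes=(1, 2))           # field per sky pixel, per snapshot
image = np.mean(np.abs(field) ** 2, axis=0)    # time-averaged intensity map
peak = np.unravel_index(np.argmax(image), image.shape)
print(peak)
```

No antenna pair is ever correlated explicitly: each snapshot is transformed once, squared, and accumulated into the running intensity map.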

FIG. 1: When the antennas are arranged in a rectangular grid as in 
the FFT Telescope, the signal processing pipeline can be dramatically 
accelerated by eliminating the correlation step (indicated by 
a sad face): its computational cost scales as $N_a^2$, because it must be 
performed for all pairs of antennas, whereas all other steps shown 
scale linearly with $N_a$. The left and right branches recover the same 
images on average, but with slightly different noise. Alternatively, 
if desired, the FFT Telescope can produce images that are mathematically 
identical to those of the right branch (while retaining the 
speed advantage) by replacing the correlation step marked by the 
sad face by a spatial FFT, "squaring," and an inverse spatial FFT. 



It should be noted that the computational cost for the 
entire FFT Telescope signal processing pipeline is (up to 
some relatively unimportant log factors) merely proportional 
to the total number of numbers measured by all 
antennas throughout the duration of the observations. In 
particular, the time required for the spatial FFT operations 
is of the same order as the time required for the 
time-domain FFT's that are used to separate out the 
different frequencies from the time signal using standard 
digital filtering. If the antennas form an $n_x \times n_y$ rectangular 
array, so that $N_a = n_x n_y$, and each antenna measures 
$n_t$ different time samples (for a particular polarization), 
then it is helpful to imagine this data arranged in a 
3-dimensional $n_x \times n_y \times n_t$ block. The temporal and spatial 
FFT's (left branch in Figure 1) together correspond to a 
3D FFT of this block, performed by three 1-dimensional 
FFT operations: 

1. For each antenna, FFT in the $t$-direction. 

2. For each time and antenna row, FFT in the $x$-direction. 

3. For each time and antenna column, FFT in the $y$-direction. 

One processes one such block for each of the two polarizations. 
These three steps each involve of order $n_t n_x n_y$ 
multiplications (up to order-of-unity factors $\log n_t$, $\log n_x$ 
and $\log n_y$), and it is easy to show that the number 
of operations for the three steps combined scales as 
$(n_t n_x n_y)\log(n_t n_x n_y)$, i.e., depends only on the total 
amount of data $n_t n_x n_y$. After step 3, one has the two 
electric field components from each direction at each frequency. 
Phase and amplitude calibration of each 
antenna/amplifier system is normally performed after step 
1. If one is interested in sharp pulses that are not well-localized 
in frequency, one may opt to skip step 1 or 
perform a broad band-pass filtering rather than a full 
spectral separation. 
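
The three-step procedure can be checked on a small random block. The sketch below uses arbitrary illustrative block dimensions and confirms that three successive 1-dimensional FFT passes reproduce a single 3D FFT. It also evaluates the combined operation count, since $\log n_t + \log n_x + \log n_y = \log(n_t n_x n_y)$.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n_x, n_y, n_t = 4, 6, 8    # small illustrative block
block = rng.normal(size=(n_x, n_y, n_t)) + 1j * rng.normal(size=(n_x, n_y, n_t))

step1 = np.fft.fft(block, axis=2)   # 1. for each antenna, FFT in the t-direction
step2 = np.fft.fft(step1, axis=0)   # 2. for each time and row, FFT in the x-direction
step3 = np.fft.fft(step2, axis=1)   # 3. for each time and column, FFT in the y-direction

full_3d = np.fft.fftn(block)        # single 3D FFT for comparison
same = np.allclose(step3, full_3d)

# Each pass costs about N log2(n_axis) with N = n_t n_x n_y; the three passes add up
# to N log2(N), with constant factors ignored.
n_total = n_x * n_y * n_t
ops_three_passes = n_total * (math.log2(n_t) + math.log2(n_x) + math.log2(n_y))
ops_combined = n_total * math.log2(n_total)
print(same, ops_three_passes, ops_combined)
```

Because the passes commute, calibration or frequency selection can be slotted in after the temporal pass without affecting the spatial ones.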

The FFT Telescope cuts down not only on CPU time, 
but also on data storage costs, since the amount of data 
obtained at each snapshot scales as the number of time samples 
taken times $N_a$ rather than $N_a^2$. 

In a conventional interferometer, antennas are corre- 
lated only with other antennas and not with themselves, 
to eliminate noise bias. This can be trivially incorpo- 
rated in the FFTT analysis pipeline as well by setting 
the pixel at the origin of the UV plane (corresponding to 
zero baseline) to zero, and is mathematically equivalent 
to removing the mean from the recovered sky map. 
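
The equivalence between zeroing the origin of the UV plane and subtracting the map mean is easy to verify on a discrete grid. The sketch below uses a random array as a stand-in for a recovered sky map, which is purely an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
sky_map = rng.normal(loc=5.0, size=(8, 8))   # hypothetical map with a nonzero mean

uv = np.fft.fft2(sky_map)
uv[0, 0] = 0.0                               # discard the zero-baseline pixel
map_without_zero_baseline = np.fft.ifft2(uv).real

map_mean_removed = sky_map - sky_map.mean()
max_diff = np.max(np.abs(map_without_zero_baseline - map_mean_removed))
print(max_diff)
```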



C. Practical considerations 

Although we have laid out the mathematical and com- 
putational framework for an FFT Telescope above, there 
are a number of practical issues that require better under- 
standing before building a massive scale FFT Telescope. 

As we will quantify in Section III below, the main ad- 
vantages of an FFT Telescope relative to single dish tele- 
scopes and conventional interferometers emerge when the 
number of antennas N a is very large. A successful FFTT 
design should therefore emphasize simplicity and mass- 
production, and minimize hardware costs. To exploit 
the FFT data processing speedup, care must be taken to 
make the antenna array as uniform as possible. The locations 
$\mathbf{r}_n$ of the antennas need to be kept in a planar rectangular 
grid to within a small fraction of a wavelength, 
so when selecting the construction site, it is important 
that the land is quite flat to start with, that bulldozing 
is feasible, and that there are no immovable obstacles. It 
is equally important that the sky response B(r) be close 
to identical for all antennas. A ground screen, which 
can simply consist of cheap wire mesh laid out flat under 
the entire array, should therefore extend sufficiently far 
beyond the edges of the array that it can to reasonable 
accuracy be modeled as an infinite reflecting plane, af- 
fecting all antennas in the same way. The sky response 
B(r) of an antenna will also be affected by the presence 
of neighbors: whereas the response of antennas in the 
central parts of a large array will be essentially identi- 
cal to one another (and essentially identical to that for 
an antenna in the middle of an infinite array), antennas 
near the edges of the array will have significantly different 
response. Instead of complicating the analysis to incor- 
porate this, it is probably more cost effective to surround 
the desired array with enough rows of dummy antennas 
that the active ones can be accurately modeled as be- 
ing in an infinite array. These dummy antennas could 
be relatively cheap, as they need not be equipped with 
amplifiers or other electronics (merely with an equivalent 
impedance), and no signals are extracted from them. 

The FFT algorithm naturally lends itself to a rectangular 
array of antennas. However, this rectangle need 
not be square; we saw above that the processing time 
is independent of the shape of the rectangle, depending 
only on the total number of antennas, and below we will 
even discuss the extreme limit where the telescope is 
one-dimensional. Another interesting alternative to a square 
FFTT telescope is a circular one, consisting of only those 
$\pi/4 \approx 79\%$ of the antennas in the square grid that lie 
within a circle inscribed in the square. This in no way 
complicates the analysis algorithm, as the FFT's need to 
be zero-padded in any case, and increases the computational 
cost for a given collecting area by only about a 
quarter. The main advantage is a simple rotationally invariant 
synthesized beam as discussed below. Antennas 
can also be weighted in software before the spatial FFT 
to create beams with other desired properties; for example, 
edge tapering can be used to make the beam even 
more compact. A third variant is to place the antennas 
further apart to gain resolution at the price of undersampling 
the Fourier plane and picking up sidelobes. 
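
The $\pi/4$ filling fraction and the resulting cost overhead of the circular layout can be reproduced by counting grid points. The sketch below uses an arbitrary grid size and a hypothetical helper name. The same kind of mask could also hold smooth weights if one wanted to experiment with edge tapering.

```python
import numpy as np

def circular_mask(n_side):
    """Boolean mask of grid cells whose centres lie inside the inscribed circle."""
    centres = (np.arange(n_side) + 0.5) / n_side - 0.5   # cell centres in [-0.5, 0.5]
    x, y = np.meshgrid(centres, centres, indexing="ij")
    return x**2 + y**2 <= 0.25

mask = circular_mask(512)
fraction = mask.mean()               # fraction of the square grid that is populated
overhead = 1.0 / fraction - 1.0      # extra FFT grid size per unit collecting area
print(fraction, overhead)
```

The overhead of roughly 27% is the "about a quarter" quoted above: the FFT still runs over the full square grid, while only the antennas inside the circle contribute collecting area.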



III. COMPARISON OF DIFFERENT TYPES OF 
TELESCOPES 

A. Telescopes generalized 

In this section, we compare the figures of merit (res- 
olution, sensitivity, cost, etc.) of different types of tele- 
scopes, summarized in Table 1, and argue that the FFT 
Telescope is complementary to both single dish telescopes 
and standard interferometers. We identify the regimes 
where each of the three is preferable to the other two, as 
summarized in Figure 2. 

It is well-known that all telescopes can be analyzed 
within a single unified formalism that characterizes their 
linear response to sky signals and their noise properties. 
In particular, a single dish telescope can be thought of as 
an interferometer, where every little piece of the collecting 
area is an independent antenna, and the correlation 
is performed by approximate analog means using curved 
mirrors. This eliminates the costly computational step, 
but the approximations involved are only valid in a limited 
field of view (Table 1). Traditional interferometers 
can attain larger field of view and better resolution for a 
given collecting area, but at a computational cost. The 
FFT Telescope is a hybrid of the two in the sense that 
it combines the resolution of a single dish telescope with 
the all-sky field-of-view of a dipole interferometer, at 
a potentially much lower cost than either a single dish or 
a traditional interferometer of the same collecting area. 
Let us now quantify these statements, starting with angular 
resolution and its generalization and then turning 
to sensitivity and cost. We first briefly review some well-known 
radio astronomy formalism that is required for our 
applications. 

FIG. 2: Angular resolution and sensitivity are compared for 
different telescope designs, assuming that half the sky is surveyed 
during 4000 hours at an observing frequency 150 MHz, 
with 0.1% bandwidth and 200 K system temperature. Since 
$\delta T_\ell \equiv [\ell^2 C_\ell^{\rm noise}/2\pi]^{1/2}$, $\delta T_\ell = 10$ mK at $\ell = 1000$ corresponds 
to 10 $\mu$K on the vertical axis. The parameters of any analog 
telescope (using focusing optics rather than digital beamforming) 
lie in the upper right triangle between the limiting 
cases of the single receiver telescope (SRT; heavy horizontal 
line) and the single dish telescope with a maximal focal plane 
(MFPT; heavy line of slope 2/3). The parameters of a Fast 
Fourier Transform Telescope (FFTT) lie on the heavy 
line of slope $-1$, with solid squares corresponding to 
square FFTTs of side 10 m, 100 m and 1000 m, respectively. 
Moving their antennas further apart (reducing $f^{\rm cover}$ with $A$ 
fixed) would move these squares along a 45° line up to the 
right. Improved sensitivity at fixed resolution can be attained 
by building multiple telescopes (thin parallel lines correspond 
to 2, 3, ..., 10 copies). As explained in the text, SDTs, SITs 
and FFTTs are complementary: the cheapest solution is offered 
by SDTs for low resolution, FFTTs for high sensitivity 
$(C_0^{\rm noise})^{1/2} < \theta \times 2\,\mu$K, and elongated FFTTs or standard 
interferometers for high resolution $\theta < (C_0^{\rm noise})^{1/2}/2\,\mu$K. 

Table 1 - How telescope properties scale with dish size $D$, collecting area $A$ and wavelength $\lambda$. We assume that the standard 
interferometer has $N_a \sim A/D^2$ separate dishes with a maximum separation $D_{\max}$ that together cover a fraction 
$f^{\rm cover} \sim A/D_{\max}^2$ of the total array region rather uniformly. 



B. Angular resolution and the beam function $B_\ell$ 

The angular resolutions of the telescopes we will compare 
are all much better than a radian, so we can approximate 
the sky as flat for the purposes of this section. If 
we ignore polarization, then it is well-known that the response 
of an interferometer to radiation intensity coming 
from near the local zenith² and traveling in the direction 
$\hat{\mathbf{k}}$ is $W(k_x, k_y)$, the inverse Fourier transform of the 
function $\hat{W}(\Delta\mathbf{x})$ that gives the distribution of baselines 
$\Delta\mathbf{x}$. For the classic example of a single dish telescope of 
radius $R$, this formula gives 

$$W(k_x, k_y) = \left[\frac{2J_1(Rk_\perp)}{Rk_\perp}\right]^2, \qquad (11)$$



the famous Airy pattern plotted in Figure 3. Here 
$k_\perp \equiv (k_x^2 + k_y^2)^{1/2} = 2\pi\theta/\lambda$, where $\theta$ is the angle to the 
zenith. When the beam is asymmetric, we will mainly be 
interested in the azimuthally averaged beam, which again 
depends only on $\theta$; the result for a square telescope like 
the fully instrumented FFTT is plotted for comparison³. 
The figure also shows a Gaussian beam, which may be a 
better approximation for an optical telescope when the 
seeing is poor. 

For these three cases, the shapes are seen to be sufficiently 
similar that, for many purposes, all one needs to 
know about the beam can be encoded in a single number 
specifying its width. The most popular choices in 
astronomy are summarized in Table 2: the rms (the 
root-mean-squared value of $\theta$ averaged across the beam), the 
FWHM (twice the $\theta$-value where $B(\theta)$ has dropped to 
half its central value) and the first null (the smallest $\theta$ at 
which $B(\theta) = 0$). We will mainly focus on the FWHM 
in our cost comparison below. 

The primary beam $\mathbf{B}(\mathbf{q})$ that was introduced in Section II A 
can itself be derived from this same formalism 
by considering each piece of an antenna as an independent 
element. For example, a single radio dish has 
$\mathbf{B} \propto W$ with $W$ given by equation (11) modulo polarization 
complications. To properly compute the polarization 
response that is encoded in the matrix $\mathbf{B}$, the full 
3-dimensional structure of the antenna and how it is connected 
to the two amplifiers must be taken into account, 
and the presence of nearby conducting objects affects $\mathbf{B}$ 
as well. 

FIG. 3: The familiar Airy pattern that constitutes the sky response 
of a circular telescope dish of diameter $D$ is compared with the 
azimuthally averaged response of a square telescope and one with 
a Gaussian tapered aperture. The square has side $0.87D$ to have 
the same FWHM, and the Gaussian has standard deviation $0.45D$ 
to give comparable response for $\theta D/\lambda \ll 1$. 

FIG. 4: The angle-averaged UV plane sensitivity $B_\ell$ (the relative 
number of baselines with different lengths) is compared for telescopes 
of different shapes. 

² If we are imaging objects much smaller than a radian centered 
at a zenith angle $\theta$, we recover the same formula as above, but 
with the synthesized beam compressed by a factor $\cos\theta$ in one 
direction, as the source effectively sees the array at a slanting 
angle, compressed by $\cos\theta$ in one direction. 

³ For a telescope with a square dish of side $D = 2R$, convolving 
the square with itself gives the baseline distribution 

$$\hat{W}(\Delta\mathbf{x}) \propto (2R - |\Delta x|)(2R - |\Delta y|) \qquad (12)$$

when $|\Delta x| < 2R$ and $|\Delta y| < 2R$, zero otherwise. Writing 
$\Delta\mathbf{x} = r(\cos\varphi, \sin\varphi)$ and averaging over the azimuthal angle $\varphi$ gives 

$$\hat{W}(r) \propto \begin{cases} 1 - \dfrac{4}{\pi}\dfrac{r}{D} + \dfrac{1}{\pi}\dfrac{r^2}{D^2} & \text{if } r \le D, \\[1ex] \dfrac{2}{\pi}\left[\dfrac{\pi}{2} - 1 - 2\arccos\dfrac{D}{r} + 2\sqrt{\dfrac{r^2}{D^2} - 1} - \dfrac{r^2}{2D^2}\right] & \text{if } r > D, \end{cases} \qquad (13)$$

(vanishing for $r \ge \sqrt{2}D$), which is plotted in Figure 4. The synthesized beam is simply 

$$W(\mathbf{k}) = j_0(Rk_x)^2\,j_0(Rk_y)^2, \qquad (14)$$

and the azimuthal average of this function is plotted in Figure 3. 
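
The FWHM and first-null values quoted for a circular aperture can be checked numerically from the Airy pattern. The sketch below is a minimal NumPy-only implementation with hypothetical helper names. It assumes the intensity response $[2J_1(x)/x]^2$ with $x = \pi D\theta/\lambda$, evaluates $J_1$ from Bessel's integral, and locates the half-power point and first zero by bisection. For the square aperture it uses the on-axis cut $[\sin(x)/x]^2$ rather than the azimuthal average plotted in Figure 3, which is a simplification.

```python
import numpy as np

def bessel_j1(x, n_quad=2000):
    """J_1(x) from Bessel's integral, (1/pi) * int_0^pi cos(tau - x sin(tau)) dtau."""
    tau = (np.arange(n_quad) + 0.5) * np.pi / n_quad      # midpoint rule nodes
    x = np.atleast_1d(x)[:, None]
    return np.mean(np.cos(tau - x * np.sin(tau)), axis=1)

def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def airy(x):
    """Intensity response of a circular aperture, with x = pi * D * theta / lambda."""
    return float((2.0 * bessel_j1(x)[0] / x) ** 2)

def sinc2(x):
    """On-axis intensity cut for a square aperture of side D, same x convention."""
    return float(np.sinc(x / np.pi) ** 2)

# Widths in units of lambda / D, since theta = x * lambda / (pi * D).
fwhm_disk = 2.0 * bisect(lambda x: airy(x) - 0.5, 0.5, 3.0) / np.pi
first_null_disk = bisect(lambda x: float(bessel_j1(x)[0]), 3.0, 4.5) / np.pi
fwhm_square_axis = 2.0 * bisect(lambda x: sinc2(x) - 0.5, 0.5, 3.0) / np.pi
print(fwhm_disk, first_null_disk, fwhm_square_axis)
```

The disk values come out close to the familiar 1.03 and 1.22 in units of $\lambda/D$, and the on-axis square value is close to 0.89.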

For applications like CMB and 21 cm mapping, where 
one wishes to measure a cosmological power spectrum, 
the key aspect of the synthesized beam that matters is 
how sensitive it is to different angular scales $\ell$ and their 
associated spherical harmonic coefficients. This response 
to different angular scales is encoded in the spherical harmonic 
expansion of the synthesized beam $W(k_x, k_y)$. If the 
synthesized beam is rotationally symmetric (or made symmetric 
by averaging observations with different orientations 
as Earth rotates), then its spherical harmonic coefficients 
$W_{\ell m}$ vanish except for $m = 0$, and we only need to 
keep track of the so-called beam function, the coefficients 
$B_\ell \equiv W_{\ell 0}$ plotted in Figure 4. In the flat-sky approximation, 
this beam function for a rotationally symmetric 
synthesized beam reduces to the two-dimensional Fourier 
transform of $W(k_x, k_y)$, which is simply the baseline 
distribution $\hat{W}(\Delta\mathbf{x})$: 

$$B_\ell \approx \hat{W}(\ell/k). \qquad (15)$$

Figure 4 shows $B_\ell$ for the circular, square and Gaussian 
aperture cases mentioned above. WMAP and many 
other CMB experiments have published detailed measurements 
of their beam functions $B_\ell$ (e.g., [26]), many 
of which are fairly well approximated by Gaussians. For 
interferometers, the beam functions can be significantly 
more interesting. Since $B_\ell$ scales simply as the number 
of baselines at different separations, more complicated 
synthesized beams involving more than one scale can be 
designed if desirable. 
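
For the square aperture, the azimuthally averaged baseline distribution that enters equation (15) can be evaluated both numerically and in closed form. The sketch below normalizes the distribution of equation (12) to unity at zero baseline and writes $a = r/D$. The closed-form branches are what one obtains by carrying out the azimuthal average analytically, and they should be read as a worked check under these normalization assumptions rather than as the paper's own expression. The function names are hypothetical.

```python
import numpy as np

def w_hat_numeric(a, n_phi=20000):
    """Azimuthal average of (1 - a|cos phi|)(1 - a|sin phi|), clipped at zero."""
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi   # midpoint rule in angle
    wx = np.clip(1.0 - a * np.abs(np.cos(phi)), 0.0, None)
    wy = np.clip(1.0 - a * np.abs(np.sin(phi)), 0.0, None)
    return float(np.mean(wx * wy))

def w_hat_closed(a):
    """Closed-form azimuthal average for a = r / D, normalized to 1 at a = 0."""
    if a <= 1.0:
        return 1.0 - (4.0 * a - a**2) / np.pi
    if a <= np.sqrt(2.0):
        return (2.0 / np.pi) * (np.pi / 2.0 - 1.0 - 2.0 * np.arccos(1.0 / a)
                                + 2.0 * np.sqrt(a**2 - 1.0) - a**2 / 2.0)
    return 0.0   # no baselines longer than the diagonal of the square

def beam_function_square(ell, side_in_wavelengths):
    """Flat-sky B_ell from equation (15): baseline length r = ell / k, k = 2 pi / lambda."""
    a = ell / (2.0 * np.pi * side_in_wavelengths)
    return w_hat_closed(a)

for a in (0.3, 1.0, 1.2):
    print(a, w_hat_numeric(a), w_hat_closed(a))
```

The beam function falls smoothly to zero at $\ell = \sqrt{2}\,kD$, the multipole corresponding to the longest baseline across the diagonal of the square.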



C. Sensitivity 

1. How the noise power spectrum is defined and normalized 

The sensitivity of an arbitrary telescope to signals on various angular scales is quantified by its noise power spectrum $C_\ell^{\rm noise}$. If the telescope were to make a uniformly observed all-sky map, then $C_\ell^{\rm noise}$ would be the variance (due to detector noise) with which a spherical harmonic coefficient $a_{\ell m}$ could be measured. For a map that covers merely a small sky patch, the corresponding noise power spectrum is the $C_\ell^{\rm noise}$ that would result if the whole sky were observed with this same sensitivity. Without loss of generality, we can factor the noise power spectrum as [27, 28]

$$C_\ell^{\rm noise} = C^{\rm noise} B_\ell^{-2}, \qquad (16)$$



where $B_\ell$ is the beam function from the previous section, and $C^{\rm noise}$ is an overall normalization constant. To avoid ambiguity in this factorization, we normalize the beam function $B_\ell$ so that its maximum value equals unity. For a single dish, the maximum is always at $\ell = 0$. This gives the normalization $B_0 = 1$, which given equation (15) means that the synthesized beam $W$ integrates to unity and that we can interpret the signal as measuring a weighted average of the true sky map. Most interferometers have $B_0 = 0$ and thus no sensitivity to the mean; in many such cases, $B_\ell$ is roughly constant on angular scales $\ell^{-1}$ much larger than the synthesized beam but much smaller than the primary beam, taking its maximum on these intermediate scales.

Table 1 - Different measures of angular resolution, measured in units of $\lambda/D$.

                              rms     FWHM    First null
Disk of diameter D            0.53    1.03    1.22
Square of side D              0.49    0.89    1.07
Gaussian with $\sigma = D$    1       2.35    $\infty$

This seemingly annoying lack of sensitivity to the mean 
is a conscious choice and indeed a key advantage of in- 
terferometers. The mean sensitivity can optionally be 
retained by simply including the antenna autocorrela- 
tions in the analysis (i.e., not explicitly setting the pixel 
at the origin of the (u, v) plane equal to zero), but this 
pixel normally contains a large positive bias due to noise 
that is difficult to accurately subtract out. In contrast, the noise in all other $(u, v)$ pixels normally has zero mean, because the noise in different antennas is uncorrelated. Since single-dish telescopes cannot exclude this zero mode, they often require other approaches to mitigate this noise bias, such as rapid scanning or beam-switching.



2. How it depends on experimental details 

Consider a telescope with total collecting area $A$ observing for a time $\tau$ with a bandwidth $\Delta\nu$ around some frequency $\nu = c/\lambda$. If this telescope performs a single pointing in some direction, then the noise power spectrum for this observed region is [19]:

$$C^{\rm noise} = \frac{\lambda^2 \gamma T_{\rm sys}^2}{A f^{\rm cover} \tau \Delta\nu}. \qquad (17)$$



Here $\gamma$ is a dimensionless factor of order unity that depends on the convention used to define the telescope system temperature $T_{\rm sys}$; below we simply adopt the convention where $\gamma = 1$. For a single-dish telescope and for a maximally compact interferometer like the FFTT, $f^{\rm cover} = 1$. For an interferometer where the antennas are rather uniformly spread out over a larger circular area, $f^{\rm cover}$ is the fraction of this area that they cover; if there are $N_a$ antennas with diameter $D$ in this larger area of diameter $D_{\max}$, we thus have $f^{\rm cover} = N_a (D/D_{\max})^2$ and total collecting area $A = N_a \pi (D/2)^2$. For a general interferometer, the noise power spectrum depends on the distribution of baselines and could be a complicated function of $\ell$. We absorb all $\ell$-dependence into the beam function $B_\ell$ as per equation (16).

If instead of just pointing at a fixed sky patch, the telescope scans the sky (using Earth rotation and/or pointing) to map a solid angle $\Omega_{\rm map}$ that exceeds its field of view $\Omega$, and spends roughly the same amount of time covering all parts of the map, then a given point in the map is observed a fraction $\Omega/\Omega_{\rm map}$ of the time. The resulting noise power spectrum for the map is then

$$C^{\rm noise} = \frac{4\pi \lambda^3 f_{\rm sky} T_{\rm sys}^2}{\eta c\, A \Omega f^{\rm cover} \tau}. \qquad (18)$$

Here $f_{\rm sky} = \Omega_{\rm map}/4\pi$ is the fraction of the sky covered by the map, and we have introduced the dimensionless parameter $\eta = \Delta\nu/\nu = \lambda\Delta\nu/c$ to denote the relative bandwidth.
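
As a rough numerical illustration of equations (17) and (18), the following Python sketch evaluates the noise normalization for a single pointing and for a scanned map. It assumes that equation (17) has the form $C^{\rm noise} = \gamma\lambda^2 T_{\rm sys}^2/(A f^{\rm cover}\tau\Delta\nu)$ and that equation (18) follows from it by multiplying by $\Omega_{\rm map}/\Omega = 4\pi f_{\rm sky}/\Omega$. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def covering_fraction(n_antennas, dish_diameter, array_diameter):
    """f_cover = N_a (D / D_max)^2 for dishes spread over a circular area."""
    return n_antennas * (dish_diameter / array_diameter) ** 2


def noise_single_pointing(wavelength, t_sys, area, f_cover, tau, bandwidth, gamma=1.0):
    """Assumed form of equation (17); result is in K^2 sr if t_sys is in K."""
    return gamma * wavelength**2 * t_sys**2 / (area * f_cover * tau * bandwidth)


def noise_survey(wavelength, t_sys, area, f_cover, tau, eta, f_sky, fov):
    """Assumed form of equation (18) for a map covering a sky fraction f_sky."""
    return (4 * math.pi * wavelength**3 * f_sky * t_sys**2
            / (eta * C_LIGHT * area * fov * f_cover * tau))


# Hypothetical example: 2 m wavelength, 200 K system temperature,
# 10^4 m^2 of collecting area, one year of observing, 1% relative bandwidth.
example = noise_survey(wavelength=2.0, t_sys=200.0, area=1.0e4, f_cover=1.0,
                       tau=3.15e7, eta=0.01, f_sky=0.1, fov=math.pi)
```

Halving `f_cover` at fixed collecting area doubles both results, which is the dilution effect described in the text.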



D. The 3D noise power spectrum $P^{\rm noise}$

For 21 cm applications, it is also important to know the three-dimensional noise power spectrum of the "data cube" mapped by treating the frequency as the radial direction (the lower the frequency, the larger the redshift and hence the larger the distance to the hydrogen gas responsible for the 21 cm signal). In a comoving volume of space subtending a small angle $\theta \ll 1$ and a small redshift range $\Delta z/z \ll 1$ centered around $z_*$, we can linearize the relation between the comoving coordinate ${\bf r}$ and the observed quantities $(\theta_x, \theta_y, \nu)$ (e.g., [24]):

$$\Delta{\bf r}_\perp = d_A(z_*)\,\Delta\Theta, \qquad (19)$$
$$\Delta r_\parallel = y(z_*)\,\Delta\nu. \qquad (20)$$

Here $\Delta\Theta = (\theta_x, \theta_y) = (\hat k_x, \hat k_y)$ gives the angular distance away from the center of the field being imaged, and $\Delta{\bf r}_\perp$ is the corresponding comoving distance transverse to the line of sight. $d_A(z)$ is the comoving angular diameter distance to redshift $z$, and

$$y(z) = \frac{\lambda_{21}(1+z)^2}{H(z)}, \qquad (21)$$



where $\lambda_{21} \approx 21$ cm is the rest-frame wavelength of the 21 cm line, and $H(z)$ is the cosmic expansion rate at redshift $z$. In Appendix B, we show that these two conversion functions can be accurately approximated by

$$d_A(z) \approx 14.8\,{\rm Gpc} - \frac{16.7\,{\rm Gpc}}{(1+z)^{1/2}}, \qquad (22)$$
$$y(z) \approx \frac{\lambda_{21}(1+z)^{1/2}}{\Omega_m^{1/2} H_0} \approx \frac{18.5\,{\rm Mpc}}{1\,{\rm MHz}} \left(\frac{1+z}{10}\right)^{1/2} \qquad (23)$$

for the $z \gg 1$ regime most relevant to 21 cm tomography given the flat concordance cosmological parameter values $\Omega_m = 0.25$ and $H_0 = 72$ km s$^{-1}$ Mpc$^{-1}$ [29, 30].
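
The two conversion functions can be evaluated directly. The Python sketch below encodes the fitting formulas of equations (22) and (23) as read from the text, and cross-checks their coefficients against the matter-dominated limit $H(z) \approx H_0 \Omega_m^{1/2}(1+z)^{3/2}$, which is an assumption made here for the check and is not quoted from Appendix B. All function names are hypothetical.

```python
import math

LAMBDA_21_M = 0.2110611   # rest-frame wavelength of the 21 cm line, in metres
C_KM_S = 299_792.458      # speed of light in km/s
H0_KM_S_MPC = 72.0        # Hubble constant quoted in the text
OMEGA_M = 0.25            # matter density parameter quoted in the text


def d_a_gpc(z):
    """Equation (22): comoving angular diameter distance in Gpc, for z >> 1."""
    return 14.8 - 16.7 / math.sqrt(1.0 + z)


def y_mpc_per_mhz(z):
    """Equation (23): comoving depth per unit frequency in Mpc/MHz, for z >> 1."""
    return 18.5 * math.sqrt((1.0 + z) / 10.0)


def y_matter_dominated(z):
    """Equation (21) with the assumed H(z) = H0 sqrt(Omega_m) (1+z)^1.5, in Mpc/MHz."""
    hubble_m_s_mpc = H0_KM_S_MPC * 1.0e3 * math.sqrt(OMEGA_M) * (1.0 + z) ** 1.5
    mpc_per_hz = LAMBDA_21_M * (1.0 + z) ** 2 / hubble_m_s_mpc
    return mpc_per_hz * 1.0e6


def distance_slope_gpc():
    """Coefficient 2c / (H0 sqrt(Omega_m)) of the (1+z)^(-1/2) term, in Gpc."""
    return 2.0 * C_KM_S / (H0_KM_S_MPC * math.sqrt(OMEGA_M)) / 1.0e3
```

Under these assumptions, `y_matter_dominated(9)` and `distance_slope_gpc()` land close to the quoted coefficients 18.5 Mpc/MHz and 16.7 Gpc.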

If a 2-dimensional map is subdivided into pixels of area $\Omega_{\rm pix}$ and the noise is uncorrelated with variance $\sigma^2$ in these pixels, then

$$C_\ell^{\rm noise} = \sigma^2 \Omega_{\rm pix} \qquad (24)$$

for angular scales $\ell^{-1}$ well above the pixel scale. Analogously, if a 3-dimensional map is subdivided into pixels (voxels) of volume $V_{\rm pix}$ and the noise is uncorrelated with variance $\sigma^2$ in them, then

$$P^{\rm noise} = \sigma^2 V_{\rm pix} \qquad (25)$$



on length scales well above the pixel scale. Since the volume of a 3D pixel is $V_{\rm pix} = (d_A^2 \Omega_{\rm pix}) \times (y\Delta\nu)$, i.e., its area times its depth, combining equations (18), (24) and (25) gives the large-scale noise power spectrum

$$P^{\rm noise} = \frac{4\pi f_{\rm sky} \lambda^2 T_{\rm sys}^2\, y\, d_A^2}{A \Omega f^{\rm cover} \tau}. \qquad (26)$$
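
The step from the 2D to the 3D noise power can be checked numerically. The Python sketch below assumes the reading of equation (18) in which $\eta c = \lambda\Delta\nu$, and verifies that multiplying the 2D noise by the voxel depth factor $d_A^2\, y\, \Delta\nu$ gives a result independent of the channel width, as equation (26) implies. The function names and units are illustrative choices.

```python
import math


def noise_2d(wavelength, t_sys, area, f_cover, tau, bandwidth, f_sky, fov):
    """Assumed equation (18), written with eta * c = wavelength * bandwidth."""
    return (4 * math.pi * f_sky * wavelength**2 * t_sys**2
            / (area * fov * f_cover * tau * bandwidth))


def noise_3d(wavelength, t_sys, area, f_cover, tau, f_sky, fov, d_a, y):
    """Assumed equation (26): d_a in Mpc and y in Mpc/Hz give K^2 Mpc^3."""
    return (4 * math.pi * f_sky * wavelength**2 * t_sys**2 * y * d_a**2
            / (area * fov * f_cover * tau))


def noise_3d_from_voxels(bandwidth, d_a, y, **telescope):
    """P_noise = C_noise * d_A^2 * y * bandwidth, from equations (24) and (25)."""
    return noise_2d(bandwidth=bandwidth, **telescope) * d_a**2 * y * bandwidth
```

The bandwidth cancels in `noise_3d_from_voxels`, so narrower channels make the 2D noise per channel larger but leave the 3D noise power unchanged.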



When 2D and 3D power spectra are discussed in the cosmology literature, it is popular to introduce corresponding quantities

$$(\delta T_\ell)^2 \equiv \frac{\ell(\ell+1)}{2\pi} C_\ell, \qquad (27)$$
$$\Delta(k)^2 \equiv \frac{4\pi k^3}{(2\pi)^3} P(k), \qquad (28)$$

which give the variance contribution per logarithmic interval in scale. One typically has $\delta T_\ell \sim \Delta(k)$ when both the angular scale $\ell$ and the bandwidth $\Delta\nu$ are chosen to match the length scale $\Delta r = 2\pi/k$, i.e., when $\ell = k d_A$ and $\Delta\nu = 2\pi/ky$. Beware that here (and only here) we use $k$ to denote the wavenumber of cosmic fluctuations, while everywhere else in this paper, we use it to denote the wave vector of electromagnetic radiation.



1. Sensitivity to point sources 

It is obviously good to have a small noise power spectrum $C_\ell^{\rm noise}$ and a large field of view. However, the tradeoff between these two differs depending on the science goal at hand. Below we mention two cases of common interest.



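If one wishes to measure the flux $\phi$ from an isolated point source, it is easy to show that the attainable accuracy $\Delta\phi$ is

$$\Delta\phi = \left[\frac{1}{4\pi C^{\rm noise}} \sum_\ell (2\ell+1) B_\ell^2\right]^{-1/2}. \qquad (29)$$

In the approximation of a Gaussian beam $B_\ell = e^{-\theta^2\ell^2/2}$ with rms width $\theta \ll 1$, this simplifies to

$$\Delta\phi \approx \theta\sqrt{4\pi C^{\rm noise}} = 4\pi T_{\rm sys}\,\theta \sqrt{\frac{\lambda^3 f_{\rm sky}}{A \Omega f^{\rm cover} \eta c \tau}} \approx 4\pi T_{\rm sys} \sqrt{\frac{\lambda^4 f_{\rm sky}}{A^2 \Omega \tau \Delta\nu}}. \qquad (30)$$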



In the last step, we used the fact that the angular resolution $\theta \approx \lambda/(A/f^{\rm cover})^{1/2}$. The total information (inverse variance) in the map about the point source flux thus scales as $A^2 \Omega \tau \Delta\nu$. That this information is proportional to the field of view $\Omega$, the observing time $\tau$ and the bandwidth $\Delta\nu$ is rather obvious. That it scales with the collecting area as $A^2$ rather than $A$ is because every baseline carries an equal amount of information about the flux $\phi$, and the number of baselines scales quadratically with the area. It is independent of $f^{\rm cover}$ because it does not matter how long the baselines are; therefore the result is the same regardless of where the antennas are placed. This last result also provides intuition for the $f^{\rm cover}$-factor in equation (17): since $\theta^2 C^{\rm noise}$ is independent of $f^{\rm cover}$ and $\theta \propto \lambda/(A/f^{\rm cover})^{1/2} \propto \sqrt{f^{\rm cover}}$, we must have $C^{\rm noise} \propto 1/f^{\rm cover}$. As $f^{\rm cover}$ drops and the same total amount of information is spread out over an area in the UV plane that is a factor $1/f^{\rm cover}$ larger, the information in any given $\ell$-bin that was previously observed must drop by the same factor, increasing its variance by a factor $1/f^{\rm cover}$.
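
The Gaussian-beam simplification can be checked numerically. The Python sketch below assumes that equation (29) reads $\Delta\phi = [\sum_\ell (2\ell+1) B_\ell^2 / 4\pi C^{\rm noise}]^{-1/2}$ with $B_\ell = e^{-\theta^2\ell^2/2}$, and compares the direct sum against the closed form $\theta\sqrt{4\pi C^{\rm noise}}$ of equation (30). The function names and the truncation of the sum at $\ell = 10/\theta$ are illustrative choices.

```python
import math


def beam_sum(theta, ell_max=None):
    """Sum over ell of (2 ell + 1) B_ell^2 for a Gaussian beam of rms width theta."""
    if ell_max is None:
        ell_max = int(10.0 / theta)  # terms beyond this are suppressed by exp(-100)
    return sum((2 * ell + 1) * math.exp(-(theta * ell) ** 2)
               for ell in range(ell_max + 1))


def flux_error_exact(theta, c_noise):
    """Assumed form of equation (29)."""
    return math.sqrt(4 * math.pi * c_noise / beam_sum(theta))


def flux_error_gaussian(theta, c_noise):
    """First step of equation (30), valid for theta << 1."""
    return theta * math.sqrt(4 * math.pi * c_noise)
```

For a beam of rms width 0.01 radians the two expressions agree to better than one percent, and the agreement improves as the beam narrows.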



2. Power spectrum sensitivity 

For CMB and 21 cm applications, one is interested in measuring the power spectrum $C_\ell$ of the sky signal. The accuracy with which this can be done depends not only on $C_\ell^{\rm noise}$, but also on the signal $C_\ell$ itself (which contributes sample variance) and on the mapped sky fraction $f_{\rm sky}$. The average power spectrum across a band consisting of $\Delta\ell$ multipoles centered around $\ell$ can be measured to precision [31, 32]

$$\Delta C_\ell = \sqrt{\frac{2}{(2\ell+1)\Delta\ell f_{\rm sky}}}\,(C_\ell + C_\ell^{\rm noise}). \qquad (31)$$



Since $C_\ell^{\rm noise} \propto f_{\rm sky}$, there is an optimal choice of $f_{\rm sky}$ that minimizes $\Delta C_\ell$. In cases where $f_{\rm sky} < 1$ is optimal, this best choice corresponds to $C_\ell^{\rm noise} \sim C_\ell$, so that sample variance and noise make comparable contributions [31, 32]. This means that optimized measurements tend to fall into one of three regimes:

1. No detection: $\Delta C_\ell \gtrsim C_\ell$ even when $f_{\rm sky}$ is made as small as the telescope permits. Upper limit $\propto C_\ell^{\rm noise}$.

2. Improvable detection: $C_\ell^{\rm noise} \sim C_\ell$, and $\Delta C_\ell \propto f_{\rm sky}^{-1/2} \propto (C^{\rm noise})^{1/2}$.

3. Cosmic variance limited detection: $C_\ell^{\rm noise} \ll C_\ell$, and further noise reductions do not help.

The regime normally depends on $\ell$, since $C_\ell$ and $C_\ell^{\rm noise}$ tend to have different shapes. For example, the WMAP measurement of the unpolarized CMB is in regimes 1, 2 and 3 at $\ell \sim 1000$, $\ell \sim 300$ and $\ell \sim 100$, respectively.
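
The optimal sky fraction can be made concrete with a short Python sketch. It assumes equation (31) in its standard form $\Delta C_\ell = [2/((2\ell+1)\Delta\ell f_{\rm sky})]^{1/2}(C_\ell + C_\ell^{\rm noise})$ together with $C_\ell^{\rm noise} = f_{\rm sky}\, C_\ell^{\rm noise,full}$, where $C_\ell^{\rm noise,full}$ is the noise that a full-sky map would have for the same total observing time. Setting the derivative with respect to $f_{\rm sky}$ to zero gives $C_\ell^{\rm noise} = C_\ell$. The function names and the lower bound `f_min` are hypothetical.

```python
import math


def delta_cl(ell, delta_ell, f_sky, c_signal, c_noise):
    """Assumed form of equation (31): band-power error bar."""
    return math.sqrt(2.0 / ((2 * ell + 1) * delta_ell * f_sky)) * (c_signal + c_noise)


def best_sky_fraction(c_signal, c_noise_full_sky, f_min=1.0e-4):
    """Sky fraction minimizing delta_cl when the noise scales in proportion to f_sky.

    The unconstrained optimum makes noise equal to signal; the result is
    clipped to the range [f_min, 1] that the telescope is assumed to permit.
    """
    f_opt = c_signal / c_noise_full_sky
    return min(1.0, max(f_min, f_opt))
```

When the clipped value is 1, the measurement is in the cosmic variance limited regime; when it is `f_min`, the measurement is noise dominated even for the smallest map.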



E. Field of view $\Omega$

The field of view of a telescope is the solid angle that it can map in a single pointing. For a telescope with a single dish of diameter $D$ and a single receiver/detector pixel in its focal plane (a dish for satellite TV reception, say), the receiver will simply map a sky patch corresponding to the angular resolution $\sim \lambda/D$, giving $\Omega \sim (\lambda/D)^2$. The opposite extreme is to fill the entire focal plane with receivers, as is often done for, e.g., microwave and optical telescopes. In Appendix C, we show that the largest focal plane possible covers an angle of order $(\lambda/D)^{1/3}$, corresponding to $\Omega \sim (\lambda/D)^{2/3}$. This upper bound comes from the fact that the analog Fourier transform performed by telescope optics is only approximate. Many actual multi-receiver telescopes fall somewhere between these two extremes. In summary, single-dish telescopes have a field of view somewhere in the range

$$\left(\frac{\lambda}{D}\right)^2 \lesssim \Omega \lesssim \left(\frac{\lambda}{D}\right)^{2/3}. \qquad (32)$$

We refer to the two extreme cases in this inequality as the single receiver telescope (SRT) and the maximal focal plane telescope (MFPT), respectively.

Since the FFTT performs its Fourier transform with no approximations, it can in principle observe the entire sky above the horizon, corresponding to $\Omega = 2\pi$. However, the useful field of view is only of order half of this, because the image quality degrades near the horizon: viewed from a zenith angle $\theta$, one dimension of the telescope appears foreshortened by a factor $\cos\theta$, causing loss of both angular resolution and collecting area (and thus sensitivity) near the horizon.
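
To get a feel for the size of these fields of view, the Python sketch below evaluates the two single-dish extremes and compares them with a useful FFTT field of view taken to be $\pi$ steradians, i.e., half of $2\pi$. All prefactors of order unity are dropped, so the numbers are order-of-magnitude estimates only. The function names and the example dish are hypothetical.

```python
import math

FOV_FFTT_USEFUL = math.pi  # about half of the 2*pi sr above the horizon


def fov_single_receiver(wavelength, diameter):
    """SRT: one resolution element, Omega ~ (lambda / D)^2 steradians."""
    return (wavelength / diameter) ** 2


def fov_max_focal_plane(wavelength, diameter):
    """MFPT: largest usable focal plane, Omega ~ (lambda / D)^(2/3) steradians."""
    return (wavelength / diameter) ** (2.0 / 3.0)


# Hypothetical example: a 100 m dish observing at a wavelength of 2 m.
srt = fov_single_receiver(2.0, 100.0)
mfpt = fov_max_focal_plane(2.0, 100.0)
gain_over_srt = FOV_FFTT_USEFUL / srt
gain_over_mfpt = FOV_FFTT_USEFUL / mfpt
```

Because the noise power scales inversely with the field of view in the survey case, these ratios translate directly into sensitivity gains at fixed collecting area.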



F. Cost 

Detailed cost estimates for telescopes are notoriously 
difficult to make, and will not be attempted here. We 



will instead limit our analysis to the approximate scal- 
ing of cost with collecting area, as summarized in Ta- 
ble 1, which qualitatively determines which telescopes are 
cheapest in the different parts of the parameter space of 
Figure 2. 

For a single-dish telescope, the cost usually grows slightly faster than linearly with area. Specifically, it has been estimated that the cost $\propto A^{1.35}$ for radio telescopes [33].

For a standard interferometric telescope consisting of $N$ separate dishes, the total cost for the dishes themselves is of course proportional to $N$. However, the cost for the correlator hardware that computes the correlations between all the $N(N-1)/2$ pairs of dishes scales as $N^2$, and thus completely dominates the total cost in the large-$N$ limit that is the focus of the present paper (already at the modest scale of the MWA experiment, where $N = 512$, the $N$ and $N^2$ parts of the hardware cost are comparable). For fixed dish size, the total collecting area $A \propto N$, so that the cost $\propto A^2$.

For an FFT Telescope, the costs of the antennas, ground screen and amplifiers are all proportional to the number of antennas and hence to the area. As described in Section II, the cost of the computational hardware is also proportional to the area, up to some small logarithmic factors that we can ignore to first approximation.
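
The difference between the two correlator scalings is easy to tabulate. The Python sketch below counts the $N(N-1)/2$ antenna pairs of a conventional correlator and compares them with $N\log_2 N$, the operation count of an FFT up to a constant factor. Treating both as directly comparable costs ignores hardware constants, so this is only a scaling illustration, and the function names are hypothetical.

```python
import math


def correlator_pairs(n_antennas):
    """Number of distinct antenna pairs, N (N - 1) / 2."""
    return n_antennas * (n_antennas - 1) // 2


def fft_operations(n_antennas):
    """N log2 N, the FFT operation count up to a constant factor."""
    return n_antennas * math.log2(n_antennas)


def cost_ratio(n_antennas):
    """How many times more pairwise correlations than FFT operations."""
    return correlator_pairs(n_antennas) / fft_operations(n_antennas)


# The ratio grows roughly like N / (2 log2 N).
ratios = {n: cost_ratio(n) for n in (512, 8192, 10**6)}
```

For $N = 512$ the ratio is a few tens, while for a million antennas it exceeds ten thousand.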

The above-mentioned approximate scalings are of course only valid over a certain range. All telescopes must have $A \gtrsim \lambda^2$. The cost of single dishes grows more rapidly once their structural integrity becomes an issue; for example, engineering challenges appear to make a single-dish radio telescope with $A = 1$ km$^2$ daunting with current technology (footnote 4), and for an FFTT with diameter $A^{1/2} \gg 10$ km, compensating for Earth's curvature could become a major cost (footnote 5). Finally, an all-digital telescope like the FFTT is currently limited by Moore's law for computer processing speed to frequencies below a few GHz, and analog interferometry has not yet been successfully carried out above optical frequencies.



4 However, an interesting design for which this might be feasible has been proposed in [34], where an almost flat telescope rests close to the ground and the focal plane is carried by a steerable Helium balloon.

5 If Earth were a perfect sphere of radius $R \approx 6400$ km, then a planar telescope of radius $r$ would be a height

$$h \approx \frac{r^2}{2R} \approx 8\,{\rm m} \left(\frac{r}{10\,{\rm km}}\right)^2 \qquad (33)$$

above the ground at its edges. If the telescope is not planar, one cannot use the straightforward FFT analysis method. In practice, this might only be a problem if both of the dimensions of the FFTT are $\gg 10$ km: as long as the telescope can be kept flat in the narrowest dimension, it will have no intrinsic (Gaussian) curvature even if the telescope has Earth's circular shape in its wide direction. An interesting question for future work is whether some algorithm incorporating an FFT along the long axis can be found that provides an exact and efficient recovery of the sky map for this generalized case.

G. Which telescope is best for what?

Let us now put together the results from the previous 
subsections to investigate which telescope design is most 
cost effective for various science goals. 

We will use the noise power spectrum $C_\ell^{\rm noise}$ to quantify sensitivity. We will begin our discussion focusing on only two parameters, the large-scale sensitivity $C^{\rm noise}$ and the angular resolution $\theta$, since the parametrization $C_\ell^{\rm noise} = C^{\rm noise} e^{\theta^2\ell^2}$ is a reasonable approximation for many of the telescope designs that we have discussed. We then turn to more general noise power spectra when discussing elongated FFTTs, general interferometers and the issue of point source subtraction.



1. Complementarity 

If we need a telescope with angular resolution $\theta$ and large-scale sensitivity $C^{\rm noise}$, then which design will meet our requirements at the lowest cost? The answer is summarized in Figure 2 for a $\nu = 150$ MHz example. First of all, we see that SDTs, FFTTs and SITs are highly complementary: the cheapest solution is offered by SDTs for low resolution, FFTTs for high sensitivity $(C^{\rm noise})^{1/2} < \theta \times 2\,\mu{\rm K}$, and standard interferometers or elongated FFTTs for high resolution $\theta < (C^{\rm noise})^{1/2}/2\,\mu{\rm K}$.



2. Calculational details 

A few comments are in order about how these results 
were obtained. 

For a single SRT, MFPT or FFTT, both the resolution and the sensitivity are determined by their area alone, so as the area is scaled up, they each trace a line through the $(\theta, (C^{\rm noise})^{1/2})$ parameter space of Figure 2. The cheapest way to attain a better sensitivity at the same resolution is simply to build multiple telescopes of the same area (except for the FFTT, where cost $\propto A$, so that one might as well build a single larger telescope instead and get extra resolution for free). Since $C^{\rm noise} \propto 1/N\Omega$, where $N$ is the number of telescopes whose images are averaged together, the sensitivity of an FFTT with a given resolution can be matched by building $N = \Omega_{\rm FFTT}/\Omega$ telescopes, where $N \sim A^{1/3}/\lambda^{2/3}$ for the MFPT and $N \sim A/\lambda^2$ for the SRT. The cost relative to an FFTT of the same resolution and sensitivity thus grows as $A^{0.65}$ for MFPTs and as $A^{1.33}$ for SRTs. The area below which single dish telescopes are cheaper depends strongly on wavelength; for the illustrative purposes of Figure 2, we have taken this to be (10 m)$^2$ at 150 MHz based on crude hardware cost estimates for the GMRT [5] and MWA [1] telescopes.

For regions to the right of the FFTT line in Figure 2, one has the option of either building a square (or circular) FFTT with unnecessarily high sensitivity to attain the required resolution, or building an elongated FFTT or a conventional interferometer. We return to this below, and argue that the latter is generally cheaper.

3. How the results depend on frequency and survey details 

Although Figure 2 is for a specific example, these qualitative results hold more generally. Survey duration, bandwidth, system temperature and sky coverage all merely rescale the numbers on the vertical axis, leaving the figure otherwise unchanged. As one alters the observing wavelength, the resolution and sensitivity remain the same if one alters the other scales accordingly: $A \propto \lambda^2$, $c\tau \propto \lambda$, except that $T_{\rm sys}$ grows rapidly towards very low frequencies as the brightness temperature of synchrotron radiation exceeds the instrument temperature. The cost depends strongly and non-linearly on frequency. As discussed in Section III F, both the FFTT and digital SITs are currently feasible only below about 1 GHz, and analog interferometry has not yet been successfully carried out above optical frequencies.

4. The advantage of an FFTT over a single dish telescope

The results above show that the FFTT can be thought of as simply a cheap single-dish telescope with a 180° field of view. Compared to a single-dish telescope, the FFTT has two important advantages:

1. It is cheaper in the limit of a large collecting area, with the cost scaling roughly like $A$ rather than $A^{1.35}$ or more.

2. It has better power spectrum sensitivity even for fixed area $A$, because of a field of view that is larger by a factor between $(D/\lambda)^{2/3}$ and $(D/\lambda)^2$.

An important disadvantage of the FFTT is that it currently only works below about 1 GHz. Even if it were not for this limitation, since the computational cost of interferometry depends on the number of resolution elements $N \sim \Omega/\theta^2$, which grows fast toward higher frequencies (as $\nu^{4/3}$ for the MFPT and as $\nu^2$ for the FFTT), single-dish telescopes become comparatively more advantageous at higher frequencies. However, as Moore's law marches on, the critical frequency where an FFTT loses out to an SDT should grow exponentially over time.

5. The advantage of an FFT Telescope over a traditional 
correlating interferometer 

The results above also show that the FFTT can be thought of as a cheap maximally compact interferometer with a full-sky primary beam. To convert a state-of-the-art interferometer such as MWA [1], LOFAR [2], PAPER [4] or 21CMA [3] into an FFTT, one would need to do three things:

1. Move all antenna tiles together so that they nearly touch.

2. Get rid of any beamformer that "points" tiles towards a specific sky direction by adding relative phases to its component antennas, and treat each antenna as independent instead, thus allowing the array to image all sky directions simultaneously.

3. Move the antennas onto a rectangular grid to cut the correlator cost from $N^2$ to $N \log_2 N$.

This highlights both advantages and disadvantages of the 
FFTT compared to traditional interferometers. There 
are three important advantages: 

1. It is cheaper in the limit of a large collecting area, with the cost scaling roughly like $A$ rather than $A^2$.

2. It has better power spectrum sensitivity even for fixed area $A$, because of a field of view that is larger than for an interferometer whose primary beam is not full sky (because its array elements are either single-dish radio telescopes or antenna tiles that are pointed with beamformers).

3. The synthesized beam is as clean and compact as for an SDT, corresponding to something like a simple Airy pattern. This has advantages for multifrequency point source subtraction as discussed below, and also for high fidelity mapmaking.

The most obvious drawback of a square or circular FFTT 
is that the angular resolution is much poorer than what 
a traditional interferometer can deliver. This makes it 
unsuitable for many traditional radio astronomy applica- 
tions. We discuss below how this drawback can be partly 
mitigated by a rectangular rather than square design. 

A second drawback is the lack of flexibility in antenna positioning. Whereas traditional interferometry allows one to place the antennas wherever it is convenient given the existing terrain, the construction of a large FFTT requires bulldozing.

6. The advantage of a 2D FFTT over a 1D FFTT exploiting Earth rotation

There are two fundamentally different approaches to fully sampling a disk around the origin of the Fourier plane (usually referred to as the UV plane in radio astronomy terminology): build a two-dimensional array (like a square FFTT) whose baselines cover this disk, or build a more sparse array that fills the disk gradually, after adding together observations made at multiple times, when Earth rotation has rotated the available baselines. Equation (18) shows that, given a fixed number of antennas and hence a fixed collecting area, the former option gives lower $C^{\rm noise}$ and hence more accurate power spectrum measurements as long as the angular resolution is sufficient. The reason is that the factor $f^{\rm cover}$ in the denominator equals unity for the former case, and is otherwise smaller. For a rectangular FFTT of dimensions $D_{\min} \times D_{\max}$, $B_\ell^2 f^{\rm cover}$ depends on the angular scale $\ell$, and it is easy to show that

$$B_\ell^2 f^{\rm cover} \propto \begin{cases} 1 & \text{for } \ell < \ell_{\min}, \\ \ldots & \end{cases}$$

In essence, making the telescope more oblong simply dilutes the same total amount of information out over a broader range of $\ell$-space, thus giving poorer sensitivity on the angular scales originally probed.

What telescope configuration is desirable depends on 
the science goal at hand. It has been argued [24, 35] 
that for doing cosmology with 21 cm tomography in the 
near term, it is best to make the telescope as compact as 
possible, i.e., to build a square or circular telescope. The 
basic origin of this conclusion is the result "a rolling stone 
gathers no moss" mentioned in Section III D 2: for power 
spectrum measurement, it is optimal to focus the efforts 
to make the signal-to-noise of order unity. The first gen- 
eration of experiments have much lower signal-to-noise 
than this, and thus benefit from focusing on large angu- 
lar scales and measuring them as accurately as possible 
rather than measuring a larger range of angular scales 
with even poorer sensitivity. Of course, none of these 1st 
generation telescopes were funded for 21cm cosmology 
alone, and their ability to perform other science hinges 
on having better angular resolution, explaining why they 
were designed with less compact configurations. Better 
angular resolution can also aid point source removal. 

For other applications where high angular resolution is required, an oblong telescope is preferable. An interesting proposal of this type is the Pittsburgh Cylinder Telescope [15, 16], proposed for higher-frequency mapping, which is one-dimensional. Instead of rather omnidirectional antennas, it takes advantage of its one-dimensional nature by having a long cylindrical mirror, which increases the collecting area at higher frequencies. This is advantageous because its goal is to map 21 cm emission at the lower redshifts (higher frequencies > 200 MHz) corresponding to the period after cosmic reionization, to detect neutral hydrogen in galaxies and use this to measure the baryon acoustic oscillation scale as a function of redshift. If one wishes to perform rotation synthesis with an oblong or 1D FFTT, it will probably be advantageous to build multiple telescopes rotated relative to one another (say in an L-shaped layout, or like spokes on a wheel), to reduce the amount of integration time needed to fill the UV plane. Cross-correlating the $N$ antennas between the telescopes would incur a prohibitive $N^2$ computational cost, so such a design with $T$ separate telescopes would probably need to discard all but a fraction $1/T$ of the total information, corresponding to the intra-telescope baselines.

Another array layout giving higher resolution is to build an array whose elements consist of FFTTs placed far apart. After performing a spatial FFT of their individual outputs, these can then be multiplied and inverse-transformed pairwise, and the resulting block coverage of the UV plane can be filled in by Earth rotation. As long as the number of separate FFTTs is modest, the extra numerical cost for this may be acceptable.

Above we discussed the tradeoff between different shapes for fixed collecting area. If one instead replaces a $D \times D$ two-dimensional FFTT by a one-dimensional FFTT of length $D$ using rotation synthesis, then equation (18) shows that one loses sensitivity in two separate ways: at the angular scale $\ell \sim D/\lambda$ where the power spectrum error bar $\Delta C_\ell$ from equation (31) is the smallest, one loses one factor of $D/\lambda$ from the drop in $f^{\rm cover}$, and a second factor of $D/\lambda$ from the drop in collecting area $A$. Another way of seeing this is to note that the available information scales as the number of baselines, which scales as the square of the number of antennas and hence as $A^2$. This quadratic scaling can also be seen in equation (30): the total amount of information $(\Delta\phi)^{-2}$ scales as $A^2 \Omega \tau \Delta\nu$, so whereas field of view, observing time and bandwidth help only linearly, area helps quadratically. This is because we can correlate electromagnetic radiation at different points in the telescope, but not at different times, at different frequencies or from different points in the sky. The common statement that the information gathered scales as the étendue $A\Omega$ is thus true only at fixed $\ell$; when all angular scales are counted, the scaling becomes $A^2\Omega$.

If in the quest for more sensitivity, one keeps lengthening an oblong or one-dimensional FFTT to increase the collecting area, one eventually hits a limit: the curvature of Earth's surface makes a flat telescope with $D \gg 10$ km exceedingly costly, requiring instead a telescope curving along Earth's surface and the alternative analysis framework mentioned above in Section III F. If one desires maximally straightforward data analysis, one thus wants to grow the telescope in the other dimension to make it less oblong, as discussed in Section III F. This means that if one needs $\gg 10^4$ antennas for adequate 21 cm cosmology sensitivity, one is forced to build a 2D rather than 1D telescope. For comparison, even the currently funded MWA experiment with its $512 \times 4^2 = 8192$ antennas is close to this number.
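
The scale at which Earth's curvature starts to matter can be estimated with the small-angle sagitta formula $h \approx r^2/2R$ from footnote 5. The Python sketch below evaluates it for a perfectly spherical Earth of radius 6400 km, which is the idealization used in that footnote. The function name is hypothetical.

```python
EARTH_RADIUS_KM = 6400.0  # idealized spherical Earth, as in footnote 5


def edge_height_m(radius_km):
    """Height in metres of the edge of a flat telescope of the given radius
    above the curved ground, using h = r^2 / (2 R)."""
    return radius_km**2 / (2.0 * EARTH_RADIUS_KM) * 1000.0


# Antenna count quoted for the MWA: 512 tiles of 4 x 4 antennas each.
MWA_ANTENNAS = 512 * 4**2

heights = {r: edge_height_m(r) for r in (1.0, 10.0, 100.0)}
```

The height grows quadratically with radius, from under a tenth of a metre at 1 km to several metres at 10 km and several hundred metres at 100 km.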

One final science application where 2D is required is the study of transient phenomena that vary on a time scale much shorter than a day, invalidating the static sky approximation that underlies rotation synthesis. This was the key motivation behind the aforementioned Waseda telescope [10-12].



IV. APPLICATION TO 21 CM TOMOGRAPHY 

In the previous section we discussed the pros and cons
of the FFTT, and found that its main strength



is for mapping below about 1 GHz when extreme sensi- 
tivity is required. This suggests that the emerging field 
of 21 cm tomography is an ideal first science applica- 
tion of the FFTT: it requires sky mapping in the sub- 
GHz frequency range, and the sensitivity requirements, 
especially to improve cosmic microwave background con- 
straints on cosmological parameters, are far beyond what 
has been achieved in the past [24, 37-39]. 



A. 21cm tomography science 

It is becoming increasingly clear that 21 cm tomog- 
raphy has great scientific potential for both astrophysics 
[18-21, 35] and fundamental physics [24, 36-39]. The ba- 
sic idea is to produce a three-dimensional map of the mat- 
ter distribution throughout our Universe through preci- 
sion measurements of the redshifted 21 cm hydrogen line. 
For astrophysics, much of the excitement centers around 
probing the cosmic dark ages and the subsequent epoch 
of reionization caused by the first stars. Here we will 
focus mainly on fundamental physics, as this arguably 
involves both the most extreme sensitivity requirements 
and the greatest potential for funding extremely sensitive 
measurements. 




FIG. 5: 21 cm tomography can potentially map most of our observable universe (light blue/gray), whereas the CMB probes mainly a thin shell at $z \approx 1100$ and current large-scale structure maps (here exemplified by the Sloan Digital Sky Survey and its luminous red galaxies) map only small volumes near the center. Half of the comoving volume lies at $z > 29$ (Appendix B). This paper focuses on the convenient $7 < z < 9$ region (dark blue/gray).

1. Three physics frontiers

Future measurements of the redshifted 21 cm hydro- 
gen line have the potential to probe hitherto unexplored 
regions of parameter space, pushing three separate fron- 
tiers: time, scale, and sensitivity. Figure 5 shows a scaled 
sketch of our observable Universe, our Hubble patch. It 
serves to show the regions that can be mapped with var- 
ious cosmological probes, and illustrates that the vast 
majority of our observable universe is still not mapped. 
We are located at the center of the diagram. Galaxies 
(from the Sloan Digital Sky Survey (SDSS) in the plot) 
map the distribution of matter in a three-dimensional region at low redshifts. Other popular probes like gravitational lensing, supernovae Ia, galaxy clusters and the Lyman $\alpha$ forest are currently also limited to the small volume fraction corresponding to redshifts $z \lesssim 3$, and in many cases much less. The CMB can be used
to infer the distribution of matter in a thin shell at the 
so-called "surface of last scattering" , whose thickness cor- 
responds to the width of the black circle at z ~ 1100 and 
thus covers only a tiny fraction of the total volume. The 
region available for observation with the 21 cm line of 
hydrogen is shown in light blue/grey. Clearly the 21 cm 
line of hydrogen has the potential of allowing us to map 
the largest fraction of our observable universe and thus 
obtain the largest amount of cosmological information. 

At the high redshift end (z > 30) the 21 cm signal 
is relatively simple to model as perturbations are still 
linear and "gastrophysics" related to stars and quasars 
is expected to be unimportant. At intermediate times, 
during the epoch of reionization (EOR) around redshift 
z ~ 8, the signal is strongly affected by the first genera- 
tion of sources of radiation that heat the gas and ionize 
hydrogen. Modeling this era requires understanding a 
wide range of astrophysical processes. At low redshifts, 
after the epoch of reionization, the 21 cm line can be used 
to trace neutral gas in galaxies and map the large scale 
distribution of those galaxies. 



2. The time frontier

Figure 5 illustrates that observations of the 21 cm line
from the EOR and higher redshifts would map the distribution
of hydrogen at times where we currently have no
other observational probe, pushing the redshift frontier.
Measurements of the 21 cm signal as a function of redshift
will constrain the expansion history of the universe,
the growth rate of perturbations and the thermal history
of the gas during an epoch that has yet to be probed.

• Tests of the standard model predictions for our cosmic
thermal history T(z), expansion history H(z)
(which can be measured independently using both
expansion and the angular diameter distances), and
linear clustering growth.

• Constraints on modified gravity from the above-mentioned
measurements of H(z) and clustering growth.

• Constraints on decay or annihilation of dark matter
particles, or any other long-lived relic, from the
above-mentioned measurement of our thermal history
[40-42]. Here 21 cm is so sensitive that even
the expected annihilation of "vanilla" neutralino
WIMP cold dark matter may be detectable [42].

• Constraints on evaporating primordial black holes
from the thermal history measurement [43].

• Constraints on time-variation of fundamental physical
constants such as the fine structure constant
[44].



3. The scale frontier


FIG. 6: 21 cm tomography can push the scale frontier far be- 
yond that of current measurements of cosmic clustering, po- 
tentially all the way down to the Jeans scale at the right edge 
of the figure. This allows distinguishing between a host of
alternative inflation and dark matter models that are consistent
with all current data, for example warm dark matter with
mass 14 keV (dashed curve) or greater and inflation with a
running spectral index more extreme than $dn_s/d\ln k = -0.03$
(dotted).

These observations can potentially push the "scale
frontier", significantly extending the range of scales that
are accessible for cosmology. This is illustrated in Figure 6,
where the scales probed by different techniques are
compared to what is available in 21 cm. Neutral hydrogen
is a good probe of small scales for two separate
but related reasons. First, one can potentially make
observations at higher redshifts, where more of the scales of
interest are in the linear regime and thus can be better
modeled. Second, at early times in the history of our
Universe, hydrogen is still very cold and thus its distribution
is expected to trace that of the dark matter down to
very small scales, the so-called Jeans scale, where pressure
forces in the gas can compete with gravity [45].

• Precision tests of inflation, since smaller scales pro- 
vide a longer lever arm for constraining the spec- 
tral index and its running (illustrated in Figure 6) 
for the power spectrum of inflationary seed fluctu- 
ations [24] 

• Precision tests of inflation by constraining small- 
scale non-Gaussianity [46]. 

• Precision constraints on non-cold dark matter from 
probing galactic scales while they were still linear.

4. The sensitivity frontier

This combination of a large available volume with the
presence of fluctuations on small scales that can be used
to constrain cosmology implies that the amount of information
that at least in principle can be obtained with
21 cm tomography is extremely large. This can be illustrated
by calculating the number of Fourier modes available to
do cosmology that can be measured with this technique.
This number can be compared with the number of modes
measured to date with various other techniques such as
galaxy surveys, the CMB, etc. In Figure 7, we show the
number of modes measured by past surveys and some
planned probes including 21 cm experiments (see footnote 6).
The figure illustrates a trend akin to Moore's law: exponential
progress as a function of year. It is striking that the
improvement of the 1 km² FFTT over WMAP is comparable
to that of WMAP over COBE. Moreover, the
ultimate number of modes available to be observed with
21 cm tomography is dramatically larger still, upward



Footnote 6: Although the number of modes gives an estimate of the
statistical power of a survey, constraints on specific parameters will
depend on how strongly each of the power spectra varies as a
function of the parameter of interest. Furthermore, when considering
probes such as the Lyman-α forest, which probes modes
in the non-linear regime, our numbers based on the Gaussian
formula overestimate the constraining power. In constructing
this figure, only modes in the linear regime $k < 0.1\,h\,{\rm Mpc}^{-1}$
were included for galaxy surveys. This is the range of modes
that is typically used for doing cosmology. If the galaxy formation
process becomes sufficiently well understood it may become
feasible to increase the number of useful modes.



FIG. 7: Number of modes measured with different cosmolog- 
ical probes. We show illustrative examples of galaxy redshift 
surveys (CfA, PsCz, 2dF, SDSS main sample (SMS), SDSS 
Luminous red galaxies (SLRG)), CMB experiments (COBE, 
WMAP and Planck), Lyman-α forest measurements (using
high resolution spectra (HLα) and SDSS spectra (SLα)) and
21 cm experiments (MWA, an extension of MWA with ten
times the collecting area, the Square Kilometer Array (SKA)
and a 1 km² FFTT). The number of modes is calculated from
the constraints these experiments can place on the overall
amplitude of the power spectrum (δP/P) and then using the
formula for Gaussian random fields, $\delta P/P = \sqrt{2/N_{\rm modes}}$.
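
As a rough illustration of the Gaussian-field formula quoted in the caption of Figure 7, the following sketch converts between a fractional error on the power spectrum amplitude and an effective mode count. The function names are hypothetical, and the numbers in the usage note are purely illustrative rather than values read off the figure.

```python
import math


def modes_from_fractional_error(dp_over_p: float) -> float:
    """Invert dP/P = sqrt(2 / N_modes) to get the effective mode count."""
    if dp_over_p <= 0:
        raise ValueError("fractional error must be positive")
    return 2.0 / dp_over_p**2


def fractional_error_from_modes(n_modes: float) -> float:
    """Fractional error on the power spectrum amplitude for N_modes modes."""
    if n_modes <= 0:
        raise ValueError("mode count must be positive")
    return math.sqrt(2.0 / n_modes)
```

For example, a 1% constraint on the amplitude would correspond to about 2 x 10^4 modes under this formula, and each factor of 10 improvement in δP/P corresponds to a factor of 100 in the mode count.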



of $10^{16}$, so although many practical issues will most certainly
limit what can be achieved in the near future, the
ultimate potential is vast.

The FFTT sensitivity improvement translates into better
measurement accuracy for many of the usual cosmological
parameters. It has been shown that even
the limited redshift range 7 < z < 9 (dark shading in
Figure 5) has the potential to greatly improve on cosmic
microwave background constraints from WMAP and
Planck: it could improve the sensitivity to spatial curvature
and neutrino masses by up to two orders of magnitude,
to $\Delta\Omega_k \approx 0.0002$ and $\Delta m_\nu \approx 0.007$ eV, and give a
4σ detection of the spectral index running predicted by
the simplest inflation models [24]. Indeed, it may even
be possible to measure three individual neutrino masses
from the scale and time dependence of clustering [24, 47].

Measuring the 21 cm power spectrum and using it to 
constrain physics and astrophysics does not require push- 
ing the noise level down to the signal level, since the 
noise can be averaged down by combining many Fourier 
modes probing the same range of scales. This is analogous
to how the COBE satellite produced the first measurement
of the CMB power spectrum even though individual
pixels in its sky maps were dominated by noise
rather than signal [48]. Further boosting the sensitivity 
to allow imaging (with signal-to-noise per pixel exceeding 
unity) allows a number of improvements: 

• Improving quantification, modeling and under- 
standing of foregrounds and systematic errors 

• Pushing down residual foregrounds with better 
cleaning (like in the CMB field, the residual fore- 
ground level after cleaning is likely to be compara- 
ble to the noise level) 

• Enabling power spectrum and non-Gaussianity es- 
timation after masking out ionized bubbles, thus 
greatly reducing the hard-to-model "gastrophysics" 
contribution 

• Constraining small-scale properties of dark matter
by using 21 cm maps as backgrounds for gravitational
lensing experiments that could detect the
presence of dark substructure in lower-redshift halos
[49-51]

• Pushing to higher redshift where the physics is sim- 
pler 



B. The cost of sensitivity 

There is thus little doubt that sensitivity improvements
can be put to good use. Equation (26) implies
that the high-redshift frontier in particular has an almost
insatiable appetite for sensitivity: since $\lambda \propto (1+z)$,
$y \propto (1+z)^{1/2}$, $d_A$ depends only weakly on z, and the diffuse
synchrotron foreground that dominates $T_{\rm sys}$ at low
frequencies scales roughly as $\nu^{-2.6} \propto (1+z)^{2.6}$ in the
cleanest parts of the sky for 50 < ν < 200 MHz [52],
equation (26) gives a sensitivity

$$\delta T \propto \left[k^3 P^{\rm noise}\right]^{1/2} \propto \frac{k^{3/2}(1+z)^{3.85}}{(A f^{\rm cover})^{1/2}} \tag{35}$$

if the observing time and field of view are held fixed (like
for the FFTT). Pushing from z = 9 to z = 20 with
the same sensitivity thus requires increasing the collecting
area by a factor around 300. This would keep the
signal-to-noise level roughly the same if the 21 cm fluctuation
amplitude is comparable and peaks at similar
angular scales at the two redshifts, as suggested by the
calculations of [23]. Equation (35) shows that imaging
smaller scales is expensive too, with an order of magnitude
smaller scales (multiplying k by 10) requiring a
thousandfold increase in collecting area.

FIG. 8: The rough hardware cost in 2008 US Dollars of attaining
various sensitivities at the k = 0.1/Mpc scale with the
FFTT telescope (green curves), a maximally compact regular
interferometer (blue curves) and a single-dish telescope
(red curve) always pointing towards the same patch of sky
($4\pi f_{\rm sky} = \Omega$). The dashed curves have angular resolution poorer
than ℓ = 500 at redshift z = 9; for the SIT and FFTT,
this resolution can be achieved by making the telescope array
oblong at a higher cost (solid curves), since the area must be
increased to compensate for the drop in $f^{\rm cover}$. Note that cost
is a function of A only, so to plot it as a function of the sensitivity
$A f^{\rm cover}$, a design-dependent relation between A and $f^{\rm cover}$ is
required.
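
The two numerical statements following equation (35), a factor of about 300 in area from z = 9 to z = 20 and a factor of 1000 for ten times larger k, follow directly from the scaling. A minimal sketch of that arithmetic is given below, assuming the scaling $\delta T \propto k^{3/2}(1+z)^{3.85}/(A f^{\rm cover})^{1/2}$ with everything else held fixed; the function name and signature are illustrative only.

```python
def area_boost_for_fixed_sensitivity(z_from: float, z_to: float,
                                     k_ratio: float = 1.0) -> float:
    """Factor by which the effective area A * f_cover must grow to keep
    delta T unchanged when moving from redshift z_from to z_to and
    multiplying the wavenumber k by k_ratio.

    Assumes delta T ~ k^(3/2) (1+z)^3.85 / (A f_cover)^(1/2), so the area
    scales as the square of the numerator.
    """
    redshift_factor = ((1.0 + z_to) / (1.0 + z_from)) ** (2 * 3.85)
    scale_factor = k_ratio ** 3
    return redshift_factor * scale_factor
```

With these assumptions, `area_boost_for_fixed_sensitivity(9, 20)` is close to 300, and `area_boost_for_fixed_sensitivity(9, 9, k_ratio=10)` is 1000.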

Figures 8 and 9 illustrate the rough cost of attaining
the sensitivity levels required for various physics milestones
mentioned above. Our cost estimates are very
crude, and making more accurate ones would go beyond
the scope of the present paper, but the qualitative scalings
seen in the figures should nonetheless give a good
indication of how the different telescope designs complement
each other (see footnote 7). For our estimates, we have assumed
t = 4000 hours of observation with a system temperature
T_sys = 200 K × [(1 + z)/10]^2.6. We assume the cosmic signal
to be of order δT = 5 mK at the redshifts of interest
[23]. Our baseline estimates assume an observing frequency
of 142 MHz, corresponding to 21 cm emission at redshift
z = 9.

Footnote 7: For interferometer arrays, we use the following hardware
cost estimate loosely based on the MWA hardware budget
[1]: $1M × β (A/8000 m²) + $1M × (A/8000 m²)^γ, where (β, γ) =
(1.2, 1) for the FFTT and (β, γ) = (1, 2) for a conventional
interferometer. The first term corresponds to per-antenna costs
(with β reflecting the extra construction cost related to land
leveling etc.), and the second term corresponds to the
computational cost. For a single dish, we assume a hardware cost
$0.4M × [A/(1600 m²)]^1.35 based on Wilkinson's scaling [33] from
the GMRT budget [5].

FIG. 9: Same as previous figure, but when half the sky is
mapped (4π f_sky = 2π). The standard interferometric telescope
(SIT) and single dish maximal focal plane telescope (MFPT)
take an additional cost hit here, needing to further increase
the area to compensate for the drop in field of view Ω with
A.
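
To make the crude cost model of footnote 7 concrete, the sketch below evaluates it for a given collecting area. It assumes the reading of the footnote in which β multiplies the per-antenna term and γ is the exponent of the computational term; the function names are hypothetical and costs are returned in millions of 2008 US dollars.

```python
def interferometer_cost_musd(area_m2: float, beta: float, gamma: float) -> float:
    """Hardware cost in M$: per-antenna term plus computational term."""
    x = area_m2 / 8000.0          # area in units of 8000 m^2
    return beta * x + x ** gamma


def fftt_cost_musd(area_m2: float) -> float:
    """FFTT: (beta, gamma) = (1.2, 1)."""
    return interferometer_cost_musd(area_m2, beta=1.2, gamma=1.0)


def conventional_cost_musd(area_m2: float) -> float:
    """Conventional interferometer: (beta, gamma) = (1, 2)."""
    return interferometer_cost_musd(area_m2, beta=1.0, gamma=2.0)


def single_dish_cost_musd(area_m2: float) -> float:
    """Single dish: 0.4 M$ x [A / 1600 m^2]^1.35."""
    return 0.4 * (area_m2 / 1600.0) ** 1.35
```

Under these assumptions a 1 km² array (A = 10^6 m²) would cost roughly 275 M$ as an FFTT and roughly 15,750 M$ as a conventional interferometer, the difference being entirely due to the quadratic computational term.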

Figure 8 is for the case when all we care about is sensitivity,
not how large a sky area is mapped with this
sensitivity. We thus keep the telescope pointing at the
same sky patch and get $4\pi f_{\rm sky} = \Omega$, so equation (35) gives
a sensitivity $\delta T \propto k^{3/2}(1+z)^{3.85}/(A f^{\rm cover})^{1/2}$. For a
fixed spatial scale k and redshift z, the sensitivity thus depends
only on the effective collecting area $f^{\rm cover} A$ plotted
on the horizontal axis. The solid curves in the figure all
have maximally compact configurations with $f^{\rm cover} = 1$,
corresponding to angular resolution $\ell \sim A^{1/2}/\lambda$. The
lines are dotted where this resolution ℓ < 500 for the
baseline wavelength λ = 2.1 m. If we insist on the higher
resolution ℓ = 500, we can achieve this goal by making
the FFTT or SIT oblong or otherwise sparse, with
$f^{\rm cover} = A/(\lambda\ell)^2 \propto A$, so in this regime, $(A f^{\rm cover}) \propto A^2$
and hence $A \propto (A f^{\rm cover})^{1/2}$, and this area in turn determines
the cost; this is why the solid curves in Figure 8
lie above the corresponding dotted ones.
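
The trade-off between resolution and covering fraction described above can be sketched as follows. This is only an order-of-magnitude illustration: it takes the relation $\ell \sim A^{1/2}/\lambda$ at face value (dropping all factors of order unity), and the function names and default values (ℓ = 500, λ = 2.1 m) are chosen here for illustration.

```python
import math


def covering_fraction(area_m2: float, ell: float = 500.0,
                      wavelength_m: float = 2.1) -> float:
    """f_cover = A / (lambda * ell)^2, capped at 1 for a compact array."""
    return min(1.0, area_m2 / (wavelength_m * ell) ** 2)


def area_for_effective_area(target_m2: float, ell: float = 500.0,
                            wavelength_m: float = 2.1) -> float:
    """Physical collecting area A needed to reach a given A * f_cover
    while keeping resolution ell.

    In the sparse regime A * f_cover = A^2 / (lambda * ell)^2, so
    A = sqrt(target) * lambda * ell; once the compact array already
    resolves ell, A equals the target.
    """
    resolving_area = (wavelength_m * ell) ** 2
    if target_m2 >= resolving_area:
        return target_m2
    return math.sqrt(target_m2 * resolving_area)
```

For instance, reaching an effective area of 10^4 m² at ℓ = 500 would require about 1.05 x 10^5 m² of physical area under these assumptions, which is why insisting on resolution raises the cost at low sensitivity.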

Figure 9 is for the case when we want a map of a
fixed area (WMAP-style), in this case covering half the
sky ($f_{\rm sky} = 0.5$), so equation (35) gives a sensitivity
$\delta T \propto k^{3/2}(1+z)^{3.85}/(A\Omega f^{\rm cover})^{1/2}$. For a fixed spatial
scale and redshift, the sensitivity thus depends only
on the effective collecting étendue $f^{\rm cover} A\Omega$ plotted on
the horizontal axis. Since Ω drops with area for both
the SIT and MFPT, in order to boost sensitivity, these
telescopes now need an extra area boost to make up for
the drop in Ω. Although an MFPT has cost $\propto A^{1.35}$, it
also has $\Omega \propto A^{-1/3}$, so that $A\Omega \propto A^{2/3}$ and the cost
$\propto (A\Omega)^{1.35 \times 3/2} \approx (A\Omega)^2$. Once A is large enough to give
sufficient resolution ($A > \lambda^2\ell^2$) it becomes smarter to
simply build multiple telescopes, giving cost $\propto A$.

For comparison, we have indicated some sensitivity
benchmarks as vertical lines. Equation (35) shows that
$\delta T \propto k^{3/2}(1+z)^{3.85}$; this redshift scaling is illustrated
by these vertical lines. Additional sensitivity can also be
put to good use for probing smaller scales, since an order
of magnitude change in k corresponds to three orders of
magnitude on the horizontal axis.



C. 21 cm foregrounds 

Aside from its extreme sensitivity requirements, an- 
other unique feature of 21 cm cosmology is the magnitude 
of its foreground problem: it involves mapping a faint 
and diffuse cosmic signal that needs to be separated from 
foreground contamination that is many orders of magni- 
tude brighter [20, 22, 52], requiring extreme sensitivity 
and beam control. Fortunately, the foreground emission 
(mainly synchrotron radiation) has a rather smooth fre- 
quency spectrum, while the cosmological signal varies 
rapidly with frequency (corresponding to variations in 
physical conditions along the line of sight). Early work 
on 21cm foregrounds [53-55] has indicated that this can 
be exploited to clean out the foregrounds down to an ac- 
ceptable level, effectively by high-pass filtering the data 
cube in the frequency direction. 
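
As a toy illustration of removing a spectrally smooth component along the frequency direction, the sketch below fits a low-order polynomial in log-frequency to the log-brightness of each sky pixel and subtracts it. This is one simple stand-in for the high-pass filtering idea, not the specific method of refs. [53-55]; the function name, the polynomial order and the array layout are assumptions made here, and beam effects are ignored entirely.

```python
import numpy as np


def subtract_smooth_foreground(cube, freqs, order=2):
    """Remove a smooth spectral component from each line of sight.

    cube  : array (n_freq, n_pix) of positive brightness temperatures
    freqs : array (n_freq,) of observing frequencies
    order : polynomial order of the smooth model in log-log space
            (an illustrative choice, not taken from the paper)
    """
    x = np.log(freqs / np.mean(freqs))
    # Fit all pixels at once; coeffs has shape (order + 1, n_pix).
    coeffs = np.polynomial.polynomial.polyfit(x, np.log(cube), order)
    # polyval returns shape (n_pix, n_freq); transpose back to the cube layout.
    smooth = np.exp(np.polynomial.polynomial.polyval(x, coeffs)).T
    return cube - smooth
```

For a pure power-law foreground the log-log fit is exact, so the residual retains the rapidly varying component apart from the few smooth modes absorbed by the fit.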

However, these papers have generally not treated
the additional complication that the synthesized beam
$W(\theta_x, \theta_y)$ is frequency dependent, dilating like λ, which
means that raw sky maps at two different frequencies
cannot be readily compared. For a single-dish telescope
or an FFTT, the synthesized beam is compact and
simple enough that this complication can be modeled and
remedied exactly (say by convolving maps at all frequencies
to have the same resolution before foreground cleaning),
but for a standard interferometer, complicated low-level
"frizz" extending far from the central parts of the
synthesized beam appears to make this unfeasible at the
present time. Recent work [56-58] has indicated that this
is a serious problem: whereas the foreground emission
from our own galaxy is smooth enough that these off-beam
contributions average down to low levels, emission
from other galaxies appears as point sources to which the
telescope response varies rapidly with frequency because
of the beam dilation effect. The ability to mitigate this
problem is still subject to significant uncertainty [58], and
may therefore limit the ultimate potential of 21 cm cosmology
with a conventional interferometer. The ability
to deal with foreground contamination is thus another
valuable advantage of the FFT Telescope.

V. CONCLUSIONS 

We have presented a detailed analysis of an all-digital 
telescope design where mirrors are replaced by fast 
Fourier transforms, showing how it complements conventional
telescope designs. The main advantages over
a single dish telescope are cost and orders of magnitude
larger field-of-view, translating into dramatically
better sensitivity for large-area surveys. The key advantages
over traditional interferometers are cost (the correlator
computational cost for an N-element array scales
as $N\log_2 N$ rather than $N^2$) and a compact synthesized
beam. These traits make the FFT Telescope ideal for
applications where the angular resolution requirements 
are modest while those on sensitivity are extreme. We 
have argued that the emerging field of 21 cm tomogra- 
phy could provide an ideal first application of a very large 
FFT Telescope, since it could provide massive sensitivity 
improvements per dollar as well as mitigate the off-beam 
point source foreground problem with its clean beam. 
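
To give a feel for the correlator scaling quoted above, the following sketch compares the two operation counts for a given number of antennas. It ignores all constant prefactors, so it indicates only the asymptotic ratio; the function name is illustrative.

```python
import math


def correlator_cost_ratio(n_antennas: int) -> float:
    """Ratio N^2 / (N log2 N) of conventional to FFT correlator scaling,
    ignoring constant prefactors."""
    if n_antennas < 2:
        raise ValueError("need at least two antennas")
    return n_antennas / math.log2(n_antennas)
```

For about a million antennas (N = 2^20) the ratio is roughly 5 x 10^4, which is the origin of the large computational savings claimed for very large arrays.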

A. Outstanding challenges 

There are a number of interesting challenges and de- 
sign questions that would need to be addressed before 
building a massive FFT Telescope for 21 cm cosmology. 
For example: 

1. To what extent can the massive redundancy of
an FFT Telescope (where the same baseline is
typically measured by $\sim N_a^{1/2}$ independent antenna
pairs) be exploited to calibrate the antennas
against one another in a computationally feasible
way?

2. To what extent, if any, are more distant antennas 
outside the FFTT needed to resolve bright point 
sources and calibrate the FFTT antennas? 

3. After calibration, how do gain fluctuations in the 
individual array elements affect the noise properties 
of the recovered sky map? 

4. How do variations in the primary beam B(k) from
equation (1) between individual antennas affect
the properties of the recovered sky map?

5. How many layers of dummy antennas are needed 
around the active instrumented part of the array 
to ensure that the beam patterns of all utilized an- 
tennas are sufficiently identical? 

6. What antenna design is optimal for a particu- 
lar FFT Telescope science application, maximizing 
gain in the relevant frequency range? The limit of 
an infinite square grid of antennas on an infinite
ground screen is quite different from the limit of a
single isolated antenna, and modeling mutual coupling
effects becomes crucial when computing the
primary beam B(k) from equation (1).

7. What unforeseen challenges does the FFT Tele- 
scope entail, and how can they be overcome? 

8. Can performing the first stages of the spatial FFT
by analog means (say connecting adjacent 2 × 2 or
4 × 4 antenna blocks with Butler matrices [8]) lower
the effective system temperature in parts of the sky
with overall lower levels of synchrotron emission?

Answering these questions will require a combination of 
theoretical and experimental work. The authors are cur- 
rently designing a small FFTT prototype with a group 
of radio astronomy colleagues to address these questions 
and to identify unforeseen obstacles. 



B. Outlook 

Looking further ahead, we would like to encourage the- 
orists to think big and look into what additional physics 
may be learned from the sort of massive sensitivity gains 
that an FFTT could offer, as this can in turn increase 
the motivation for hard work on experimental challenges 
like those listed above. 

Perhaps in a distant future, almost all telescopes will 
be FFT Telescopes, simultaneously observing light of all 
wavelengths from all directions. In the more immediate 
future, as Moore's law enables FFTT's with higher band- 
width, cosmic microwave background polarization may 
be an interesting application besides 21 cm cosmology. 
By using an analog frequency mixer to extract of order a
GHz of bandwidth in the CMB frequency range (around
say 30 GHz or 100 GHz), it would be possible to obtain
a much greater instantaneous sky coverage than current
CMB experiments provide, and this gain in Ω could outweigh
the disadvantage of lower bandwidth Δν in equation
(18) to provide overall better sensitivity. The fact
that extremely high spectral resolution would be avail- 
able essentially for free may also help ground-based mea- 
surements, allowing exploitation of the fact that some 
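atmospheric lines are rather narrow.

APPENDIX A: POLARIZATION ISSUES

The Stokes matrix S defined by equation (6) is related
to the usual Stokes parameters I, Q, U, V by

$$\mathbf{S} = \frac{1}{2}\begin{pmatrix} I+Q & U-iV \\ U+iV & I-Q \end{pmatrix} = \frac{1}{2}\,\mathbf{v}\cdot\boldsymbol{\sigma}. \tag{A1}$$

In the dot product,

$$\mathbf{v} = (I, Q, U, V) \tag{A2}$$

and

$$\boldsymbol{\sigma} = \left\{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\right\} \tag{A3}$$

contains the four Pauli matrices. As usual, I denotes
the total intensity, Q and U quantify the linear polarization
and V the circular polarization (which normally
vanishes for astrophysical sources). It is easy to invert
equation (A1) to solve for the Stokes parameters:

$$\mathbf{v} = {\rm tr}\{\boldsymbol{\sigma}\mathbf{S}\}. \tag{A4}$$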
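
The relation between the Stokes matrix and the Stokes parameters can be checked numerically. The sketch below assumes the ordering and sign convention in which the Pauli matrices are paired with (I, Q, U, V) so that the lower-left element of S is (U + iV)/2; the function names are hypothetical.

```python
import numpy as np

# Pauli-matrix basis ordered to pair with v = (I, Q, U, V).
SIGMA = np.array([
    [[1, 0], [0, 1]],
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
], dtype=complex)


def stokes_to_matrix(v):
    """S = (1/2) sum_i v_i sigma_i, as in equation (A1)."""
    return 0.5 * np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), SIGMA)


def matrix_to_stokes(S):
    """v_i = tr(sigma_i S), as in equation (A4)."""
    return np.einsum("ijk,kj->i", SIGMA, S).real
```

The inversion works because the four matrices are orthogonal under the trace, tr(σ_i σ_j) = 2δ_ij, so the round trip from (I, Q, U, V) to S and back returns the input.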



An annoying but harmless nuisance when dealing with 
large-area polarization maps is the well-known fact that 
"you can't comb a sphere", i.e., that there is no global 
choice of reference vector to define the Jones vector and 
the Stokes parameters (Q, U) all across the sky. In practice,
this does not matter until the very last analysis step,
since one can collect the data and reconstruct both S b 
and Sb without worrying about this issue. To compute 
B and solve for the Stokes parameters, any convention 
for defining the Stokes parameters will suffice, even one 
involving separate schemes for a number of partially over- 
lapping sky patches; it is easy to see that the choice of 
convention has no effect on the accuracy or numerical 
stability of the inversion method. 



APPENDIX B: COSMIC GEOMETRY 

In this Appendix, we derive equations (22) and (23). 
For a flat universe (which is an excellent approximation 
for ours [29, 30]), the comoving angular diameter distance 
is given by [59] 



$$d_A(z) = \int_0^z \frac{c\,dz'}{H(z')}, \tag{B1}$$

where

$$H(z) = H_0\sqrt{\Omega_m(1+z)^3 + \Omega_\Lambda}, \tag{B2}$$

and $\Omega_\Lambda = 1 - \Omega_m$. The second term in the square root
becomes negligible for $z \gg [\Omega_\Lambda/(1-\Omega_\Lambda)]^{1/3} - 1 \approx 0.4$ for
$\Omega_m = 0.25$ [30], which gives equation (23). The dark energy
density is completely negligible at the high redshift
regime relevant to 21 cm cosmology also in most models
where this density evolves with time. For such high
redshifts, we can therefore approximate equation (B1) as
follows:



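$$d_A(z) \approx \frac{c}{H_0}\left[3.56 - \int_z^\infty \frac{dz'}{\Omega_m^{1/2}(1+z')^{3/2}}\right] = \frac{c}{H_0}\left[3.56 - \frac{4}{(1+z)^{1/2}}\right] \tag{B3}$$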



for $\Omega_m = 0.25$, which gives equation (22). The accuracy
of equation (B3) is better than 1% for z > 2.2, i.e., better
than that with which the relevant cosmological parameters
have currently been measured.

Equation (B3) shows that, surveying our observable
universe as illustrated in Figure 5, we reach half the comoving
distance at $z \approx [4/(3.56 \times 0.5)]^2 - 1 \approx 4$ and half
the comoving volume at $z \approx \{4/[3.56 \times (1 - 0.5^{1/3})]\}^2 - 1 \approx 29$.
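
The high-redshift approximation and the two redshifts quoted above can be checked with a short numerical sketch. It assumes a flat universe with Ω_m = 0.25 and no radiation, measures distances in units of c/H_0, and uses a simple trapezoidal rule; the function names are illustrative.

```python
import numpy as np

OMEGA_M = 0.25  # matter density assumed in the text


def comoving_distance_exact(z, n=200001):
    """d_A(z) in units of c/H0 from (B1)-(B2), by trapezoidal integration."""
    zp = np.linspace(0.0, z, n)
    integrand = 1.0 / np.sqrt(OMEGA_M * (1.0 + zp) ** 3 + (1.0 - OMEGA_M))
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zp)))


def comoving_distance_approx(z):
    """High-redshift approximation (B3), in units of c/H0."""
    return 3.56 - 4.0 / np.sqrt(1.0 + z)


def redshift_at_distance_fraction(frac):
    """Invert (B3): redshift at which d_A reaches frac of its z -> infinity limit."""
    return (4.0 / (3.56 * (1.0 - frac))) ** 2 - 1.0
```

Here `redshift_at_distance_fraction(0.5)` gives the half-distance redshift near 4, and `redshift_at_distance_fraction(0.5 ** (1 / 3))` gives the half-volume redshift near 29, since volume scales as the cube of the comoving distance.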



APPENDIX C: FIELD-OF-VIEW ESTIMATES 

In this appendix, we derive the restriction on the field 
of view for a single dish telescope. Consider a parabolic 
mirror of height z given by: 



$$z = \frac{x^2 + y^2}{R}, \tag{C1}$$



where x and y are the coordinates in the plane of the 
ground and R determines the radius of curvature. The 
mirror has a diameter D such that 



$$x^2 + y^2 < \frac{D^2}{4}. \tag{C2}$$



We consider radiation initially traveling with wave vector
$\mathbf{k} = k(\sin\theta, 0, \cos\theta)$ with $k = 2\pi/\lambda$. We will calculate
the phase of the radiation that scatters at the location
$(x, y, z) = (\rho\cos\varphi, \rho\sin\varphi, \rho^2/R)$ on the surface
of the mirror and then arrives at a detector located at
$(x_f, y_f, z_f)$. For simplicity we will consider a point in
the mirror with y = 0 so that φ = 0 and then also set
$y_f = 0$. After some simple algebra one obtains the following
expression for the phase ψ:

$$\psi = k\left[\sqrt{(x_f - \rho)^2 + \left(z_f - \frac{\rho^2}{R}\right)^2} - \frac{\rho(\rho\cos\theta + R\sin\theta)}{R}\right]. \tag{C3}$$

Because of the parabolic shape chosen for the mirror,
radiation coming with normal incidence (θ = 0)
comes to a perfect focus at $x_f = 0$, $z_f = R/4$. By perfect
focus we mean that the phase $\psi(x_f, z_f, \rho)$ is independent
of ρ for $x_f = 0$, $z_f = R/4$, θ = 0. For radiation incident
at an angle, there will be no point in space where one 
can locate the detector so that the radiation reflected 
everywhere in the mirror will be in phase. We will find 
the field of view of the telescope by demanding that the 
phase difference between radiation incident in different 
parts of the telescope be less than a radian at the location 
of the detector. 

To obtain a formula, we expand ψ in a Taylor series as
a function of ρ. By choosing $x_f$ and $z_f$, we can make the
terms linear and quadratic in ρ vanish, but the cubic term
will in general be non-zero, except for normal incidence.
For a given telescope diameter we will then find the field
of view by demanding that the cubic contribution to the
phase be smaller than a radian. The Taylor series of ψ
is:

$$\psi \approx k\Bigg\{\sqrt{x_f^2+z_f^2} - \left[\frac{x_f}{\sqrt{x_f^2+z_f^2}} + \sin\theta\right]\rho + \left[\frac{z_f\left((R-2z_f)z_f - 2x_f^2\right)}{(x_f^2+z_f^2)^{3/2}} - 2\cos\theta\right]\frac{\rho^2}{2R} + \frac{x_f z_f\left((R-2z_f)z_f - 2x_f^2\right)\rho^3}{2R\,(x_f^2+z_f^2)^{5/2}}\Bigg\}. \tag{C4}$$

By choosing

$$x_f = -z_f\tan\theta \approx -z_f\theta, \tag{C5}$$

we can eliminate the term linear in ρ, and by choosing

$$z_f = \frac{R}{8}(\cos 2\theta + 1) \approx \frac{R}{4} \tag{C6}$$

the quadratic one. Thus we get

$$\psi \approx k\left\{\frac{R\cos\theta}{4} - \frac{4\rho^3\sin\theta}{R^2} + \cdots\right\}. \tag{C7}$$



For small values of θ, demanding that ψ changes by less
than a radian as we move from the center to the edge of
the telescope, and using $k = 2\pi/\lambda$, we obtain

$$\theta < \frac{R^2\lambda}{D^3\pi} = \frac{R^2}{D^2\pi}\,\frac{\lambda}{D}. \tag{C8}$$



Thus by increasing the radius of curvature R, one can
increase the field of view. In fact, $(R/D)^2$ basically gives
the number of resolution elements in each linear dimension
in the focal plane.

The upper bound on the curvature radius
comes from demanding that the focal plane not cover
the entire telescope. Using equations (C5) and (C6) and
demanding that the size of the focal plane be smaller than
D/2 (a very conservative assumption), we get another
constraint on the field of view:

$$\theta < \frac{2D}{R}. \tag{C9}$$

While the size of the field of view increases with R in
(C8), it decreases with R in (C9) and thus the largest
field of view is obtained when both constraints are equal,
and corresponds to

$$\theta < \left(\frac{4\lambda}{\pi D}\right)^{1/3}. \tag{C10}$$
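
The way the two constraints combine can be illustrated numerically. The sketch below evaluates the phase-error limit (C8) and the focal-plane limit (C9), and the curvature radius at which they coincide; the function names are hypothetical, angles are in radians, and the dish diameter and wavelength in the usage note are arbitrary illustrative values.

```python
import math


def fov_phase_limit(R: float, D: float, wavelength: float) -> float:
    """Field-of-view bound from the cubic phase error, equation (C8)."""
    return R**2 * wavelength / (math.pi * D**3)


def fov_focal_plane_limit(R: float, D: float) -> float:
    """Field-of-view bound from the focal plane size, equation (C9)."""
    return 2.0 * D / R


def optimal_curvature_radius(D: float, wavelength: float) -> float:
    """Value of R at which (C8) and (C9) are equal: R = D (2 pi D / lambda)^(1/3)."""
    return D * (2.0 * math.pi * D / wavelength) ** (1.0 / 3.0)


def max_field_of_view(D: float, wavelength: float) -> float:
    """Largest field of view, equation (C10)."""
    return (4.0 * wavelength / (math.pi * D)) ** (1.0 / 3.0)
```

For a dish with D = 100 m observing at λ = 2.1 m, for example, the two limits meet at R of roughly 670 m and a field of view of about 0.3 radians.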



The inequality that we have derived can be pushed
somewhat with clever multi-mirror designs (for example,
the optical large synoptic telescope uses three mirrors
[60]). In contrast, radio telescopes typically use only one
mirror. In this case, the value of R required to attain
the maximal field of view that we have derived is a factor
$(2\pi D/\lambda)^{1/3}$ larger than D and can thus get very large for
sufficiently small wavelengths. Mechanical constraints
can make building such a radio telescope impractical as
the focal plane would be very far away from the telescope,
making the upper bound $\theta < (\lambda/D)^{1/3}$ that we
have derived for the field of view a rather conservative
one.